GFNOrg / gflownet

Generative Flow Networks
MIT License

Generated Docking Candidates #11

Open haoteli opened 1 year ago

haoteli commented 1 year ago

Dear Authors,

I am developing a model that uses your dataset with docking energies. We realized that the docking energy from AutoDock Vina can fluctuate significantly depending on the input optimized molecule geometry, the random seed, and the hyperparameters chosen for the docking process. This makes it hard for us to reproduce the rewards/energies in your dataset. Since this verification method can be unstable, we would instead like to study your model further by looking at some of the top candidates generated by a GFlowNet trained on this dataset.

In the paper, you described that $10^{6}$ candidates were generated and the top-1000 were analyzed in terms of their average reward. Could you please kindly provide the $10^{6}$ or just the top-1000 generated ones along with their scores?

Thank you very much! Haote
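
For readers who want to quantify the seed-to-seed fluctuation described above, here is a minimal sketch. It assumes a hypothetical `dock(smiles, seed)` callable that returns one docking energy per run; `fake_dock` is a stand-in that only simulates noise and is not the repository's docking code or a real Vina call.

```python
import random
import statistics
from typing import Callable, Dict, Iterable


def summarize_docking_runs(
    dock: Callable[[str, int], float],
    smiles: str,
    seeds: Iterable[int],
) -> Dict[str, float]:
    """Run `dock` once per seed and summarize the spread of the energies."""
    energies = [dock(smiles, seed) for seed in seeds]
    if not energies:
        raise ValueError("at least one seed is required")
    return {
        "mean": statistics.mean(energies),
        # The sample standard deviation needs at least two runs.
        "stdev": statistics.stdev(energies) if len(energies) > 1 else 0.0,
        "min": min(energies),
        "max": max(energies),
    }


def fake_dock(smiles: str, seed: int) -> float:
    """Hypothetical stand-in: a fixed energy plus seed-dependent Gaussian noise."""
    rng = random.Random(seed)
    return -8.0 + rng.gauss(0.0, 0.5)


summary = summarize_docking_runs(fake_dock, "CCO", range(10))
print(summary)
```

Replacing `fake_dock` with a wrapper around an actual docking call would give a per-molecule estimate of how much of the mismatch with the dataset could be explained by the seed alone.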

bengioe commented 1 year ago

Hi Haote, I'm not sure I still have the original results saved from the 2021 paper. I'll check.

Do note that the top-1000 scores we report are strictly in terms of the pretrained proxy's score on them, i.e. we did not rerun docking for the compounds we found. The intention with this benchmark wasn't really to solve candidate generation for docking, but rather just to have a hard enough benchmark to work against. You're correct that AutoDock produces stochastic results.
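
As an illustration of the top-k metric discussed here, the following is a minimal sketch of selecting the k best candidates by proxy score and averaging their scores. The `(smiles, score)` pair layout, the function name `top_k_mean`, and the convention that a higher score is better are assumptions made for this example, not the format of the original results.

```python
import heapq
from statistics import mean
from typing import List, Sequence, Tuple

Candidate = Tuple[str, float]  # (SMILES string, proxy score); assumed layout


def top_k_mean(scored: Sequence[Candidate], k: int) -> Tuple[List[Candidate], float]:
    """Return the k highest-scoring candidates and the mean of their scores."""
    if k <= 0 or not scored:
        raise ValueError("need k > 0 and at least one candidate")
    # nlargest returns at most k items, sorted from best to worst.
    top = heapq.nlargest(k, scored, key=lambda pair: pair[1])
    return top, mean(score for _, score in top)


# Toy data standing in for generated candidates and their proxy scores.
candidates = [("A", 1.0), ("B", 3.0), ("C", 2.0), ("D", 4.0)]
top, avg = top_k_mean(candidates, k=2)
print(top, avg)
```

With 10^6 generated candidates and k = 1000, the same call would yield the kind of top-1000 average described above, as judged by the proxy rather than by rerunning docking.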

In terms of AutoDock hyperparameters, I'm assuming you've already seen how we define the geometry here and how we call vina here?

Let me know if that doesn't help.