Closed WoutVanEynde closed 1 year ago
Hi @WoutVanEynde,
I hope this is helpful and thank you for your interest in REINVENT.
Hello,
First of all, thanks a lot! I don't have any experience in AI or in programming, so this really helped me solve the last pieces of the puzzle!
Just some last questions, if that is alright:
I hope I do not bother you too much with these questions!
Warm regards and thanks in advance, Wout Van Eynde, student KU Leuven
Hi @WoutVanEynde,
A pre-trained generative model is provided in this repository in models/random.prior.new. It can be used as is in the "prior" and "agent" fields in the REINVENT configuration JSON. It is trained on ChEMBL, which is an open-source database for biologically active molecules. More information can be found on their website: https://www.ebi.ac.uk/chembl/
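As a minimal sketch of what "used as is" could look like, the snippet below builds only the two fields named above and prints them as JSON. The surrounding structure of the REINVENT configuration is deliberately left out, and the helper name `prior_agent_fields` is hypothetical; only the path and the two field names come from this reply.

```python
import json

# Path as given in this reply; adjust it to wherever the repository is checked out.
PRIOR_PATH = "models/random.prior.new"


def prior_agent_fields(prior_path: str = PRIOR_PATH) -> dict:
    """Return the "prior" and "agent" fields, both pointing at the same checkpoint.

    Hypothetical helper for illustration: it does not produce a complete
    REINVENT configuration, only the two fields mentioned in the reply.
    """
    return {"prior": prior_path, "agent": prior_path}


fields = prior_agent_fields()
print(json.dumps(fields, indent=2))
```

Starting the agent from the same checkpoint as the prior means reinforcement learning begins from the ChEMBL-trained model rather than from scratch.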
Docking scores can be directly optimized in REINVENT using DockStream, which is a wrapper around various docking algorithms. Using DockStream as a component of the scoring function in REINVENT allows you to optimize docking scores directly. There is a tutorial notebook in this repository, Reinforcement_Learning_Demo_DockStream, which goes over the bare minimum of how to set up the DockStream component in the REINVENT scoring function. The docking algorithm used there is Glide, which is licensed by Schrödinger, so you would need a license to use it. However, DockStream supports a total of 5 docking backends: AutoDock Vina, rDock, GOLD, Glide, and OpenEye Hybrid. AutoDock Vina and rDock are open-source docking software. For information on how to set up the configuration, see the DockStream and DockStreamCommunity repositories, which are part of the MolecularAI group. The latter has tutorial notebooks just like this repository.
Once you have chosen and set up a docking protocol, you will need to choose a score transformation. Every component in REINVENT is transformed to a score in [0, 1]. Therefore, raw docking scores also need to be transformed: see the Score_Transformations notebook in this repository for details.
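To illustrate the idea of a score transformation, here is a minimal sketch of a reverse sigmoid, which is one common choice when more negative raw scores are better (as with many docking backends). The parameter names `low`, `high` and `k` and the example values are assumptions for illustration, not settings taken from the notebook; check the Score_Transformations notebook for the options REINVENT actually exposes.

```python
def reverse_sigmoid(score: float, low: float, high: float, k: float = 0.25) -> float:
    """Map a raw docking score into (0, 1), rewarding more negative scores.

    low / high bracket the range of raw scores of interest (assumed values),
    and k controls how steep the transition between them is.
    """
    midpoint = (low + high) / 2.0
    exponent = k * (score - midpoint) * 10.0 / (high - low)
    # Clamp to avoid float overflow for scores far outside [low, high].
    exponent = max(-300.0, min(300.0, exponent))
    return 1.0 / (1.0 + 10.0 ** exponent)


# Illustrative values only: treat -12 as a very good raw score and -6 as a poor one.
for raw in (-12.0, -9.0, -6.0):
    print(raw, round(reverse_sigmoid(raw, low=-12.0, high=-6.0), 3))
```

With these assumed settings, a raw score at the midpoint (-9) maps to 0.5, scores near `low` approach 1, and scores near `high` approach 0.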
Finally, running docking directly in REINVENT will increase the computation time, as every SMILES proposed at every epoch has to be run through the docking algorithm. How long it takes will depend on which docking algorithm you use.
Let me know if this helps you set up your experiment.
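For a rough feel of the cost, the back-of-the-envelope sketch below multiplies epochs by SMILES per epoch by time per ligand. All numbers are hypothetical placeholders rather than measurements, and the estimate ignores ligand preparation, invalid or duplicate SMILES, and any caching.

```python
def estimated_docking_hours(
    n_epochs: int,
    batch_size: int,
    seconds_per_ligand: float,
    n_parallel: int = 1,
) -> float:
    """Estimate wall-clock hours if every proposed SMILES is docked at every epoch."""
    total_ligands = n_epochs * batch_size
    total_seconds = total_ligands * seconds_per_ligand / n_parallel
    return total_seconds / 3600.0


# Hypothetical run: 100 epochs, 128 SMILES per epoch, 30 s per ligand, 8 parallel workers.
hours = estimated_docking_hours(n_epochs=100, batch_size=128, seconds_per_ligand=30.0, n_parallel=8)
print(f"{hours:.1f} hours")
```

Plugging in the timing of your own docking backend on a handful of ligands gives a quick check on whether a planned run is feasible.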
Dear
This has helped me a lot! I cannot express my gratitude enough! I will try to set up a workflow in the coming days, but I think I should be fine now!
Warm regards, Wout Van Eynde
@WoutVanEynde If possible, please list your workflow here for your new project. I think that it will help the other users a lot. Many thanks.
Hi,
For the pre-trained generative model provided in this repository in models/random.prior.new, could you please provide more details on the training process/protocol of this ChEMBL prior model? I only found the data preparation of the ChEMBL dataset in your related papers, but no information about the training process. For example, do you first create an empty model using the purged dataset and then run transfer learning on that empty model with the same purged dataset? Also, which parameter settings were used to obtain this random.prior.new model? I would very much appreciate it if you could provide more details!
Hello,
I had some questions regarding setting up my own project using the Reinforcement_Learning notebook:
I hope these questions are clear and not too much of a problem!
Thank you for your time and warm regards, Wout Van Eynde