MolecularAI / REINVENT4

AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.
Apache License 2.0

Reinvent Reinforcement Learning Conventions #122

Closed aezexa closed 3 weeks ago

aezexa commented 1 month ago

Hi, I hope this message finds you well.

Firstly, I would like to extend my gratitude for your invaluable contributions to the Reinvent project. I am currently working on integrating Reinvent with a Reinforcement Learning framework and have a few technical inquiries that I hope you can assist me with.

  1. MDP Integration in Reinforcement Learning: In the context of a Markov Decision Process (MDP) for Reinforcement Learning, could you please elaborate on how the agent (actor) navigates the environment? Specifically, how are the States and Actions defined within this molecular framework?
  2. SMILES Representation Analysis: What is the maximum length of a SMILES representation that can be encountered? Additionally, could you provide information on the total number of unique SMILES representations available? This information is very helpful for performing time and space complexity analysis.

Thank you in advance for your assistance.

halx commented 1 month ago

Hi,

many thanks for your interest in REINVENT and welcome to the community!

As for 1., I refer you to eqs. 5 and 6 in the paper, which show how the molecule's score is combined with the NLLs and how the loss is computed. I am not sure how useful it is to define "state" and "action" in this context. What happens is that the score informs the algorithm how to modify the current chemical space distribution (the network) such that the next sampling step ideally results in molecules that are more likely to score higher.
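To make the combination of score and NLLs concrete, here is a minimal sketch. It assumes that eqs. 5 and 6 follow the augmented-likelihood formulation commonly described for REINVENT: the prior NLL is shifted by the scaled score, and the loss is the squared difference to the agent NLL. The function names and the `sigma` parameter are illustrative, not taken from the REINVENT4 code base, and plain Python floats stand in for tensors.

```python
def augmented_nll(prior_nll, score, sigma):
    """Augmented NLL (assumed form): the prior NLL lowered by the scaled score.

    A higher score gives a lower target NLL, i.e. a higher target likelihood.
    """
    return prior_nll - sigma * score


def rl_loss(agent_nlls, prior_nlls, scores, sigma):
    """Mean squared difference between augmented and agent NLLs over a batch.

    agent_nlls: NLL of each sampled molecule under the agent (being trained)
    prior_nlls: NLL of the same molecules under the fixed prior
    scores:     score of each molecule, assumed to lie in [0, 1]
    sigma:      scalar weighting of the score (hypothetical value chosen by the user)
    """
    per_molecule = [
        (augmented_nll(p, s, sigma) - a) ** 2
        for a, p, s in zip(agent_nlls, prior_nlls, scores)
    ]
    return sum(per_molecule) / len(per_molecule)
```

Minimizing this loss pulls the agent's NLL for a molecule toward the augmented target, so high-scoring molecules become more likely in the next sampling step, without any explicit notion of state or action at the molecule level.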

The length of the SMILES strings is determined when preprocessing the source data and is based on chemistry considerations. I am not sure what you mean by "total number of unique SMILES representations available".
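As a rough illustration of how a length limit can arise during preprocessing, the sketch below tokenizes SMILES strings with a simplified regular expression and filters a data set by token count. The tokenizer pattern, the function names and the `max_tokens` cutoff are assumptions made for this example; they are not the preprocessing used by REINVENT.

```python
import re

# Simplified SMILES tokenizer (an assumption for illustration): bracket atoms,
# the two-letter halogens Br and Cl, two-digit ring closures, then any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into tokens."""
    return TOKEN_PATTERN.findall(smiles)


def longest_token_length(smiles_list):
    """Return the largest token count found in the data set (0 if empty)."""
    return max((len(tokenize(s)) for s in smiles_list), default=0)


def filter_by_token_length(smiles_list, max_tokens):
    """Keep only the SMILES whose token count does not exceed max_tokens."""
    return [s for s in smiles_list if len(tokenize(s)) <= max_tokens]
```

With a filter of this kind, the maximum sequence length seen by the model is a property of the prepared data set rather than a fixed constant of the method.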

Cheers, Hannes.