topazape / LSTM_Chem

Implementation of the paper - Generative Recurrent Networks for De Novo Drug Design.
The Unlicense
116 stars, 55 forks

'Fragment growing' implemented in code? #12

Open shree970 opened 3 years ago

shree970 commented 3 years ago

Fantastic implementation of the paper. The authors also describe one more fine-tuning method, called 'fragment growing': if you give it a fragment as a SMILES string, it generates SMILES around that fragment. Is there any direction that you can point me to?

topazape commented 3 years ago

@shree970

Thank you for your comment.

Yes, fragment growing, as described in the paper, is implemented in the code. For example, to get 100 new molecules grown from benzamidine c1c(C(=N)N)cccc1, call lstm_chem.generator.sample(num=100, start='Gc1c(C(=N)N)cccc1').
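A minimal sketch of that call pattern, for readers who want to wrap it. The helper `grow_from_fragment` and the stand-in `fake_sample` are hypothetical and not part of the repository. The leading `G` is assumed to be a start-of-sequence marker, based only on the start string quoted above.

```python
def grow_from_fragment(sample_fn, fragment, num=100, start_token="G"):
    """Seed every generated string with the given fragment SMILES.

    sample_fn is any callable accepting num= and start= keywords,
    mirroring the sample(num=..., start=...) call quoted in the comment.
    start_token="G" is an assumption taken from that quoted start string.
    """
    if not fragment:
        raise ValueError("fragment SMILES must be non-empty")
    return sample_fn(num=num, start=start_token + fragment)


# Hypothetical stand-in sampler so the sketch runs without a trained model:
# it drops the start token and appends a growing carbon chain.
def fake_sample(num, start):
    return [start[1:] + "C" * (i + 1) for i in range(num)]


molecules = grow_from_fragment(fake_sample, "c1c(C(=N)N)cccc1", num=3)
print(molecules)
```

With a trained model loaded, the same helper would presumably be called as `grow_from_fragment(lstm_chem.generator.sample, 'c1c(C(=N)N)cccc1')`.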

Be careful, however: this benzamidine SMILES string is not canonical. It has been rewritten so that the molecule elongates in a specific direction. The canonical form is N=C(N)c1ccccc1. Appending to it, as in N=C(N)c1ccccc1CCC, gives a valid SMILES, but the growth goes in the wrong direction. So I rewrote the SMILES into a form that is both valid and grows in the right direction (e.g. c1c(C(=N)N)cccc1CCC).
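To make the point about direction concrete: characters appended to a SMILES string bond to the last atom written, so the ordering of the fragment decides where growth happens. The sketch below is illustrative only and is not code from the repository. It uses a deliberately tiny, hypothetical parser that handles single-letter atoms, branches and single-digit ring closures, and it reports how many ring bonds separate the growth atom from the ring atom carrying the amidine group.

```python
from collections import deque


def parse_smiles(smiles):
    """Parse a small SMILES subset; return (atoms, bonds, growth_atom).

    Supports single-letter atoms, '(' ')' branches and single-digit ring
    closures. Bond symbols such as '=' are skipped because only
    connectivity matters here. Not a general SMILES parser.
    """
    atoms, bonds = [], {}
    prev, stack, ring_open = None, [], {}
    for ch in smiles:
        if ch.isalpha():
            idx = len(atoms)
            atoms.append(ch)
            bonds[idx] = set()
            if prev is not None:
                bonds[idx].add(prev)
                bonds[prev].add(idx)
            prev = idx
        elif ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch.isdigit():
            if ch in ring_open:
                other = ring_open.pop(ch)
                bonds[prev].add(other)
                bonds[other].add(prev)
            else:
                ring_open[ch] = prev
    # prev is the atom that any appended character would bond to
    return atoms, bonds, prev


def bond_distance(bonds, a, b):
    """Breadth-first search for the number of bonds between atoms a and b."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in bonds[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def ring_separation(fragment):
    """Bonds between the growth atom and the ring atom bearing the amidine."""
    atoms, bonds, growth = parse_smiles(fragment)
    amidine_c = atoms.index("C")  # the only aliphatic carbon in benzamidine
    ring_atom = next(n for n in bonds[amidine_c] if atoms[n] == "c")
    return bond_distance(bonds, ring_atom, growth)


print(ring_separation("N=C(N)c1ccccc1"))    # 1: growth is ortho to the amidine
print(ring_separation("c1c(C(=N)N)cccc1"))  # 2: growth is meta to the amidine
```

Both strings describe the same benzamidine molecule, but the canonical ordering leaves the growth point next to the amidine-bearing carbon, while the rewritten ordering moves it one ring position further away.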