jkwang93 / MCMG

MCMG_V1
MIT License

One small bug, I hope it will be helpful #8

Closed Turningl closed 1 year ago

Turningl commented 1 year ago

Hello jk: when I tested your data_structs.py code, I found a small bug in it, as follows.

The original SMILES: COc1cc(N2C(N)=C(C#N)C(c3ccccc3)C3=C2CC(C)(C)CC3=O)cc(OC)c1OC

The transformer matrix: [51. 52. 54. 16. 20. 44. 6. 44. 44. 2. 19. 7. 16. 2. 19. 3. 15. 16. 2. 16. 0. 19. 3. 16. 2. 44. 8. 44. 44. 44. 44. 44. 8. 3. 16. 8. 15. 16. 7. 16. 16. 2. 16. 3. 2. 16. 3. 16. 16. 8. 15. 20. 3. 44. 44. 2. 20. 16. 3. 44. 6. 20. 16. 48.]

The generated SMILES: not_DBrD2high_QEDgood_SACOc1cc(N2C(N)=C(C#N)C(c3ccccc3)C3=C2CC(C)(C)CC3=O)cc(OC)c1OC

The generated SMILES includes some boolean attributes, but the 'DRD2' attribute is converted to 'DBrD2', so the decode function has a problem. I made a small change to it, and I hope it will be helpful!

As I step through your code, I realize that the decode part in your 2_generator_Transformer.py is very important, and I would really like to know: did you decode the generated csv file in this part without errors?
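One plausible cause of the 'DRD2' to 'DBrD2' change, offered here only as an assumption: some SMILES vocabularies store `Br` and `Cl` as single-character placeholders (`R`, `L`) and, when decoding, join all tokens and then replace those characters across the whole string. That replacement would also hit the `R` inside a condition token such as `not_DRD2`. The sketch below is not the repository's code; the placeholder map and the function names `decode_global` and `decode_per_token` are hypothetical and serve only to reproduce the symptom and show a per-token alternative.

```python
# Hypothetical placeholder map (an assumption, not taken from data_structs.py):
# single-character stand-ins for two-character halogens.
HALOGEN_PLACEHOLDERS = {"R": "Br", "L": "Cl"}


def decode_global(tokens):
    """Join tokens, then replace placeholders in the whole string.

    This reproduces the symptom: any 'R' or 'L' inside a
    multi-character condition token is rewritten as well.
    """
    smiles = "".join(tokens)
    for placeholder, halogen in HALOGEN_PLACEHOLDERS.items():
        smiles = smiles.replace(placeholder, halogen)
    return smiles


def decode_per_token(tokens):
    """Replace a placeholder only when a token is exactly that placeholder."""
    return "".join(HALOGEN_PLACEHOLDERS.get(tok, tok) for tok in tokens)


# Condition tokens followed by the tokens of bromobenzene ('R' stands for Br).
tokens = ["not_DRD2", "high_QED", "good_SA",
          "R", "c", "1", "c", "c", "c", "c", "c", "1"]

print(decode_global(tokens))     # not_DBrD2high_QEDgood_SABrc1ccccc1
print(decode_per_token(tokens))  # not_DRD2high_QEDgood_SABrc1ccccc1
```

Doing the substitution per token keeps condition tokens intact regardless of which letters they contain, so the attribute names would not need to avoid `R` or `L`.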

jkwang93 commented 1 year ago

Thank you very much for your suggestion. We did not encounter this problem during testing; data_structs.py does require users to make changes according to their own needs. Still, the situation you encountered is quite unexpected. We will confirm the bug and make adjustments later.

Turningl commented 1 year ago

Thank you very much for your reply. I would also like to ask another question. After reproducing your work, I have to say that the SMILES generated by the transformer model are good, and that is where my question lies. I know that when you encode, you represent some boolean attributes (QED, SA, JNK3...) as 0 and 1, but the SMILES produced by the decode part of 2_generator_Transformer.py don't seem to carry these boolean attributes (QED, SA, JNK3...); the generated SMILES are intact. Why do these SMILES not have any of the relevant attributes attached? I would really like to know!

Turningl commented 1 year ago

Hmm, I think I have found the answer myself. But another issue is that 4_train_agent_save_smile.py can't generate molecules effectively. I hope you can confirm the bug and make adjustments later.

jkwang93 commented 1 year ago

There should be no problem with this part of the code. You can email me if you have any specific problems: jikewang {at} whu {dot} edu {dot} cn

Turningl commented 1 year ago

I have emailed you; please check.

Turningl commented 1 year ago

This problem hasn't been solved yet, so I'm curious to know what changes you have made.

bbjy commented 1 year ago

I ran into the same problem. The molecules generated by 3_train_middle_model_dm.py have unexpected attribute tokens, e.g. "high_QEDgood_SACc1cc(-n2cc(C(F)(F)F)nn2)cc(OS(=O)(=O)NCC2CC2)c1F".

Have you solved this problem? Thank you!
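If the leading attribute tokens are simply the conditioning prefix being decoded together with the molecule, one possible workaround is to strip them before using the SMILES. This is a minimal sketch under that assumption, not a fix confirmed by the maintainers. `CONDITION_TOKENS` and `strip_condition_tokens` are hypothetical names, and the token list contains only the tokens that appear in this thread; it would need to be extended with the rest of your own vocabulary.

```python
# Condition tokens seen in this thread (assumption: your vocabulary has more;
# add them here).
CONDITION_TOKENS = ("not_DRD2", "high_QED", "good_SA")


def strip_condition_tokens(decoded, tokens=CONDITION_TOKENS):
    """Remove known condition tokens from the start of a decoded string."""
    stripped = True
    while stripped:
        stripped = False
        for tok in tokens:
            if decoded.startswith(tok):
                decoded = decoded[len(tok):]
                stripped = True
    return decoded


example = "high_QEDgood_SACc1cc(-n2cc(C(F)(F)F)nn2)cc(OS(=O)(=O)NCC2CC2)c1F"
print(strip_condition_tokens(example))
# Cc1cc(-n2cc(C(F)(F)F)nn2)cc(OS(=O)(=O)NCC2CC2)c1F
```

Only prefixes are removed, so a string without condition tokens passes through unchanged. Working on the token list before joining, when it is available, would avoid any ambiguity between a token and the SMILES characters that follow it.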