A question regarding the modeling generalization ability

MolecularAI / aizynthfinder

A tool for retrosynthetic planning

https://molecularai.github.io/aizynthfinder/

MIT License


A question regarding the modeling generalization ability #122

Closed: philipyang1 closed this issue 1 year ago

philipyang1 commented 1 year ago

Hello, I have a question regarding the generalization ability of the expansion network model. Thank you very much ;-)

I notice that the input to the network is the fingerprint of a training product and the output is that product's retro-reaction template. What happens if we try to predict on a molecule the model has never seen? Does the model still work in that case? Is the model based on the theory that compounds with similar Morgan fingerprints may have similar chemical properties, such as toxicity, water solubility, or chemical reactivity? Is there a paper that supports this? Thank you very much.

Philip Yang

philipyang1 commented 1 year ago

Hello @SGenheden, looking forward to hearing from you. I noticed the article "Planning chemical syntheses with deep neural networks and symbolic AI" (https://www.nature.com/articles/nature25978). Based on this article, does the expansion network model support predictions for a compound that the neural network has never seen? Thank you very much ;-)

SGenheden commented 1 year ago

Yes, the idea is that the neural network should be able to generalize to never-before-seen compounds. The template-based model basically works as a mapping from a molecule, represented as a fingerprint, to applicable templates, or to templates that have a good chance of being applicable. By applicable I mean that the product part of the template is a substructure match of the query molecule. Hope this helps.
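To make the mapping concrete, here is a toy sketch in plain Python. It is not aizynthfinder code and not real chemistry: the substructure labels, the template names, and the weight table are all hypothetical stand-ins for a Morgan fingerprint, the template library, and a trained network. It only illustrates the two steps described above: score templates from the molecule's features, then keep the top-ranked ones whose product part matches the query.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical template library: template name -> substructure labels that the
# product side of the template requires (stand-in for a real substructure match).
TEMPLATES: Dict[str, FrozenSet[str]] = {
    "amide_coupling": frozenset({"amide"}),
    "ester_hydrolysis": frozenset({"ester"}),
    "suzuki_coupling": frozenset({"biaryl"}),
}

# Hypothetical "learned" weights: feature -> {template: score contribution}.
# A real expansion model would be a neural network over fingerprint bits.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "amide": {"amide_coupling": 2.0},
    "ester": {"ester_hydrolysis": 2.0},
    "biaryl": {"suzuki_coupling": 2.0},
    "aryl": {"suzuki_coupling": 0.5, "amide_coupling": 0.1},
}


def rank_templates(features: FrozenSet[str]) -> List[Tuple[str, float]]:
    """Score every template from the molecule's features, best first."""
    scores = {name: 0.0 for name in TEMPLATES}
    for feature in features:
        # Features the model has no weights for simply contribute nothing.
        for name, weight in WEIGHTS.get(feature, {}).items():
            scores[name] += weight
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def is_applicable(template: str, features: FrozenSet[str]) -> bool:
    """A template applies if its product part is contained in the query."""
    return TEMPLATES[template] <= features


def expand(features: FrozenSet[str], top_k: int = 2) -> List[str]:
    """Take the top_k ranked templates and keep only the applicable ones."""
    ranked = rank_templates(features)[:top_k]
    return [name for name, _ in ranked if is_applicable(name, features)]


# A feature combination assumed to be absent from training; "fluoro" is unknown.
unseen = frozenset({"aryl", "amide", "fluoro"})
print(expand(unseen))
```

In this toy, the unseen molecule still gets a sensible suggestion because it shares features with molecules the weights were fitted on, and the applicability check discards a highly ranked template whose product part is not present in the query.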

philipyang1 commented 1 year ago

@SGenheden, thanks for your answer. It's clear to me now. So if we train the expansion network model on more reaction templates, where we can get them, we should get a better prediction model whose templates have a greater chance of being applicable to unseen compounds ;-)