deepchem / deepchem

Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology
https://deepchem.io/
MIT License

Mixture descriptors #658

Open mo-kazemi opened 6 years ago

mo-kazemi commented 6 years ago

Dear All,

Molecular featurization is done with ECFP, GraphConv, and similar methods, which convert a single molecule into a format suitable for machine learning. Does DeepChem have the capability to featurize a mixture of molecules based on their molar percentages? An example of this application can be found in the following manuscript:

Gaudin, Theophile, Patricia Rotureau, and Guillaume Fayet. "Mixture descriptors toward the development of quantitative structure–property relationship models for the flash points of organic mixtures." Industrial & Engineering Chemistry Research 54.25 (2015): 6596-6604.
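
To make the request concrete, here is a minimal sketch of one possible mixture descriptor: a mole-fraction-weighted average of per-component feature vectors. The linear mixing rule is an assumption chosen for illustration, not necessarily the formula used in the manuscript above, and `mixture_descriptor` is a hypothetical helper, not an existing DeepChem function. The per-component vectors are assumed to be precomputed by any single-molecule featurizer (ECFP bit vectors, for instance).

```python
from typing import List, Sequence


def mixture_descriptor(
    component_features: Sequence[Sequence[float]],
    molar_amounts: Sequence[float],
) -> List[float]:
    """Hypothetical helper: combine per-molecule feature vectors into one
    mixture vector using a mole-fraction-weighted average (assumed rule).

    component_features: one feature vector per mixture component.
    molar_amounts: molar percentages or fractions, one per component;
        they are normalized here, so they need not sum to 1 or 100.
    """
    if len(component_features) == 0:
        raise ValueError("at least one component is required")
    if len(component_features) != len(molar_amounts):
        raise ValueError("need exactly one molar amount per component")
    if any(amount < 0 for amount in molar_amounts):
        raise ValueError("molar amounts must be non-negative")

    total = sum(molar_amounts)
    if total <= 0:
        raise ValueError("molar amounts must sum to a positive value")

    n_features = len(component_features[0])
    if any(len(vec) != n_features for vec in component_features):
        raise ValueError("all feature vectors must have the same length")

    # Normalize the amounts to mole fractions.
    fractions = [amount / total for amount in molar_amounts]

    # Weighted average, computed feature by feature.
    return [
        sum(x * vec[j] for x, vec in zip(fractions, component_features))
        for j in range(n_features)
    ]


# Toy example: two components with made-up 3-element feature vectors,
# mixed at 75 mol% and 25 mol%.
toy_features = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
toy_mixture = mixture_descriptor(toy_features, [75.0, 25.0])
print(toy_mixture)
```

A weighted average keeps the mixture vector the same length as the single-molecule vectors, so it could be fed to the same downstream models. Nonlinear mixing rules would need a different combination step.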

rbharath commented 6 years ago

This isn't currently supported, but it sounds like an interesting feature. I'd be open to a PR adding it as a new featurizer.

mo-kazemi commented 6 years ago

Thank you rbharath.

rbharath commented 6 years ago

I'm labeling this as a research question since it looks like creating this implementation will require significant effort and isn't just a matter of porting in an existing deep learning architecture. I suspect that the implementation alone will not be publishable, but when combined with some deep learning work, may well be.

If you're interested in taking on this challenge, please leave a comment, and a member of the technical steering committee may be interested in mentoring you.

rbharath commented 6 years ago

OK, I did some more digging. I don't think this is a research question, but it is a solid intermediate-level contribution for a contributor with some background in chemistry. Contributions welcome!