ispg-group / aiidalab-ispg

ATMOSPEC: ab initio UV/vis spectroscopy for everyone
MIT License

Improve structure generation from SMILES using RDKit #1

Open danielhollas opened 2 years ago

danielhollas commented 2 years ago

Take a look at example 7 here: https://www.programcreek.com/python/example/110781/rdkit.Chem.AllChem.EmbedMolecule

This could be an improvement to the SmilesWidget in aiidalab-widgets-base, but we need a good test set for it first.

The conformational analysis performed with RDKit used the EmbedMultipleConfs function to generate the conformers. The number of conformers was set to the cube of the number of rotatable bonds (n_r^3), the pre-optimization RMSD filtration threshold was 0.1 Å, and both options useExpTorsionAnglePrefs and useBasicKnowledge were activated. The generated conformers were optimized with the MMFF94s force field, a variant of MMFF94 available in RDKit that performs better than the default version when planar N hybridizations are present in the molecules. A post-optimization filtration was performed by means of an energy cutoff (E_cutoff = 10 kcal/mol for unconstrained calculations and 20 kcal/mol for constrained calculations) and a root-mean-square deviation (RMSD) cutoff (RMSD_cutoff = 0.5 Å). All calculations executed with RDKit were performed using version 2021.03.1 of the package.
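To make the post-optimization filtration above concrete, here is a minimal pure-Python sketch. The function names, the floor of one conformer for rigid molecules, and the greedy lowest-energy-first ordering are my assumptions; the quoted text only gives the cutoff values. In practice the energies would probably come from something like RDKit's `AllChem.MMFFOptimizeMoleculeConfs` and the pairwise RMSD from an alignment routine such as `rdMolAlign.GetBestRMS`, but the sketch takes both as plain inputs so it runs without RDKit.

```python
from typing import Callable, Dict, List


def n_conformers_to_generate(n_rotatable_bonds: int) -> int:
    """Cube of the rotatable-bond count.

    The floor of 1 for rigid molecules is an assumption, not from the quoted text.
    """
    return max(1, n_rotatable_bonds ** 3)


def filter_conformers(
    energies: Dict[int, float],
    rmsd: Callable[[int, int], float],
    e_cutoff: float = 10.0,   # kcal/mol, value for unconstrained calculations
    rmsd_cutoff: float = 0.5,  # Angstrom
) -> List[int]:
    """Return the conformer ids that survive both cutoffs, lowest energy first."""
    if not energies:
        return []
    e_min = min(energies.values())
    # Energy window: keep conformers within e_cutoff of the lowest-energy one.
    candidates = sorted(
        (cid for cid, e in energies.items() if e - e_min <= e_cutoff),
        key=lambda cid: energies[cid],
    )
    kept: List[int] = []
    for cid in candidates:
        # Greedy RMSD pruning (assumed ordering): keep a conformer only if it
        # differs from every lower-energy conformer that was already kept.
        if all(rmsd(cid, other) > rmsd_cutoff for other in kept):
            kept.append(cid)
    return kept


# Toy data: energies in kcal/mol and a fake 1D "geometry" standing in for a real RMSD.
toy_energies = {0: 0.0, 1: 0.3, 2: 4.2, 3: 15.0}
toy_coords = {0: 0.0, 1: 0.2, 2: 1.5, 3: 3.0}
survivors = filter_conformers(
    toy_energies, lambda i, j: abs(toy_coords[i] - toy_coords[j])
)
print(survivors)  # [0, 2]
```

In the toy data, conformer 3 falls outside the 10 kcal/mol window and conformer 1 is within 0.5 of the lower-energy conformer 0, so only 0 and 2 survive.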

danielhollas commented 2 years ago

This turned out to be rather annoying. It looks like the methods in RDKit are rather unstable, or I am doing something wrong.

To reduce the scope of this, we will aim for stable results for a minimum test set of molecules defined in #4, in conjunction with #29 and #28 which will allow users to upload / remove individual conformers themselves in case automatic generation fails.

I will also try to be a good citizen and upstream general improvements to aiidalab-widgets-base.

danielhollas commented 2 years ago

An approach worth looking into is the one used in ternviz:

https://github.com/whitead/ternviz/blob/2a34ba5be73d8aac2e192d00a4af4e5e365cbd36/ternviz/lib.py#L127

Also SMILES validation and canonicalization: https://github.com/whitead/ternviz/blob/2a34ba5be73d8aac2e192d00a4af4e5e365cbd36/ternviz/lib.py#L23

The code is under the MIT license.

danielhollas commented 2 years ago

https://github.com/duartegroup/autodE

danielhollas commented 2 years ago

https://greglandrum.github.io/rdkit-blog/conformers/exploration/2021/01/31/looking-at-random-coordinate-embedding.html

danielhollas commented 1 year ago

https://pubs.acs.org/doi/10.1021/acs.jcim.2c00934

https://asteeves.github.io/blog/2015/01/12/conformations-in-rdkit/

danielhollas commented 1 year ago

https://corinwagen.github.io/public/blog/20221219_low_code_csearch.html