duartegroup / autodE

automated reaction profile generation
https://duartegroup.github.io/autodE/
MIT License

Accelerating the Removal of Conformers based on RMSD Thresholds #210

Closed j-westphaeling closed 1 year ago

j-westphaeling commented 1 year ago

Hi,

I was wondering whether there is a way to accelerate the removal of non-unique conformers (based on RMSD thresholds) when very large sets of conformers are requested. Specifically, I am requesting the generation of 100 000 conformers for decyl bromide. More than 4 hours pass before the set of unique conformers is found. However, once the set has been determined, the optimization with xtb finishes within minutes.

Additionally, I've noticed that the pruning based on RMSD for longer alkyl chains (e.g. C16 or C30) takes more than 72 hours, again with 100 000 conformers requested.

Is it possible to parallelize the calculation of the RMSDs on the cores available (if parallelization of these calculations is not already the default)? Or did I overlook some configuration that would accelerate this procedure?

Does anyone have a suggestion for this problem? I would very much appreciate any help!

t-young31 commented 1 year ago

Hi @j-westphaeling – thanks for the question.

tl;dr: CREST may be the right tool for this problem.

I don't think you've overlooked any configuration. Conformer generation with RDKit is pretty fast; generating 100k conformers is going to take a bit of time, but at least it scales linearly with the number of conformers. The RMSD comparison is basically an O(N^2) for loop in Python which is, as you've found, going to be slow for ~100k conformers. In my opinion there are better tools out there for this problem, e.g. CREST from the Grimme lab, which might be a good choice if you're already using xtb. It would be really nice to implement all the potentially slow steps in a lower-level language than Python for speed, but that's unlikely to be done any time soon! Parallelisation would be fairly straightforward to implement, but isn't the best way forward for this problem.
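To illustrate why the pruning step scales quadratically, here is a minimal sketch of greedy RMSD-based pruning with NumPy. It is not autodE's actual implementation: the function names `kabsch_rmsd` and `prune_by_rmsd` are made up for this example, and it assumes every conformer has the same atom ordering and ignores symmetry-equivalent atoms.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between two (n_atoms, 3) arrays after optimal superposition.

    Uses the singular values of the covariance matrix (Kabsch algorithm),
    so the rotation matrix never has to be applied explicitly.
    """
    p = p - p.mean(axis=0)  # remove translation
    q = q - q.mean(axis=0)
    u, s, vt = np.linalg.svd(p.T @ q)
    # Flip the smallest singular value if the best fit would be a reflection
    d = -1.0 if np.linalg.det(u @ vt) < 0 else 1.0
    e = (p ** 2).sum() + (q ** 2).sum() - 2.0 * (s[0] + s[1] + d * s[2])
    return np.sqrt(max(e, 0.0) / len(p))


def prune_by_rmsd(conformers, threshold):
    """Keep a conformer only if it differs from every kept one by > threshold.

    In the worst case (all conformers unique) each new conformer is compared
    against all previous ones, i.e. roughly N^2 / 2 RMSD evaluations.
    """
    unique = []
    for coords in conformers:
        if all(kabsch_rmsd(coords, kept) > threshold for kept in unique):
            unique.append(coords)
    return unique
```

For N = 100 000 conformers with most of them unique, the inner comparison runs on the order of 5 x 10^9 times, which is why the Python loop dominates the run time even though each individual RMSD is cheap.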

j-westphaeling commented 1 year ago

Thanks a lot @t-young31! I will look into CREST. I am still new to the field of computational chemistry and might over-engineer some of the issues that arise in my project. But it’s always fun and interesting to learn new things.

I am eager to see how CREST and autodE can be used together, i.e. generating the conformers with CREST and feeding them to autodE to calculate the reaction profiles I am interested in. It doesn't seem to work at the push of a single button. I guess that will be the next entry on my to-do list in the upcoming weeks.
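As a possible first step, here is a rough sketch of reading a multi-structure XYZ file in plain Python. It assumes CREST writes its conformer ensemble as concatenated XYZ frames (atom count, comment line, then one line per atom); the function name `parse_multi_xyz` is hypothetical, and passing the resulting coordinates on to autodE is deliberately left out.

```python
def parse_multi_xyz(text):
    """Split concatenated XYZ frames into a list of (comment, atoms) tuples.

    Each atom is stored as (symbol, x, y, z) with coordinates as floats.
    """
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # skip blank lines between frames
            i += 1
            continue
        n_atoms = int(lines[i].split()[0])  # first line: number of atoms
        comment = lines[i + 1].strip()      # second line: free-form comment
        atoms = []
        for line in lines[i + 2 : i + 2 + n_atoms]:
            symbol, x, y, z = line.split()[:4]
            atoms.append((symbol, float(x), float(y), float(z)))
        frames.append((comment, atoms))
        i += 2 + n_atoms  # jump to the start of the next frame
    return frames
```

The text could come from something like `open("conformers.xyz").read()`, where the file name is a placeholder for whatever ensemble file the conformer search produces.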

t-young31 commented 1 year ago

Closing as I think this is resolved(?) Please reopen if it's not 😄