openforcefield / openff-benchmark

Comparison benchmarks between public force fields and Open Force Field Initiative force fields
MIT License

Include mirror-image checking in RMS calcs for conformer deduplication #57

Open j-wags opened 3 years ago

j-wags commented 3 years ago

Our existing conformer-deduplication algorithm doesn't catch mirror-image conformers. Mirror-image conformers are functionally/electronically identical to each other, but can differ by, for example, a symmetric subgroup rotated +N versus -N degrees about a rotatable bond. In many cases these will have identical energies, but our RMSD checking algorithm doesn't identify them as duplicates, even when we check symmetry automorphs.

It seems likely that many of the conformers generated with a 0.5 Å RMSD cutoff will optimize to the same geometry during QM, which might lower the quality of the analysis. We should therefore: 1) make ALL our conformer deduplication steps check for mirror-image conformers, and 2) add a workflow step for conformer deduplication AFTER QM optimization.

In non-stereogenic molecules, it should be sufficient to reflect the x coordinates (negate x) of each conformer other than the reference, and then redo the conformer RMSD checks against the reflected copy. In stereogenic molecules, it is inappropriate (at least in most cases I can think of) to consider mirror images, since the reflection would be a different stereoisomer.
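To make the reflection idea concrete, here is a minimal NumPy sketch, not the openff-benchmark implementation. It assumes both conformers list their atoms in the same order (no symmetry-automorph search) and uses a Kabsch alignment restricted to proper rotations for the RMSD. The names `kabsch_rmsd`, `mirror_aware_rmsd`, `deduplicate` and the `allow_mirror` flag are hypothetical; `allow_mirror` would be set to `False` for stereogenic molecules.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """Minimum RMSD between two (N, 3) coordinate arrays over translations
    and proper rotations (no reflections). Assumes identical atom ordering."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p - p.mean(axis=0)  # remove translation
    q = q - q.mean(axis=0)
    u, s, vt = np.linalg.svd(p.T @ q)
    # If the optimal orthogonal transform is a reflection, flip the sign of the
    # smallest singular value so that only proper rotations are considered.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(p * p) + np.sum(q * q) - 2.0 * s.sum()) / len(p)
    return float(np.sqrt(max(msd, 0.0)))  # clamp tiny negative round-off


def mirror_aware_rmsd(p, q, allow_mirror=True):
    """RMSD between p and q, also trying q with its x coordinates negated.
    Set allow_mirror=False for stereogenic molecules (assumption of this sketch)."""
    best = kabsch_rmsd(p, q)
    if allow_mirror:
        mirrored = np.asarray(q, dtype=float) * np.array([-1.0, 1.0, 1.0])
        best = min(best, kabsch_rmsd(p, mirrored))
    return best


def deduplicate(conformers, cutoff=0.5, allow_mirror=True):
    """Greedy filter: keep a conformer only if it is farther than `cutoff`
    (in the coordinates' length unit) from every conformer kept so far."""
    kept = []
    for conf in conformers:
        if all(mirror_aware_rmsd(k, conf, allow_mirror) > cutoff for k in kept):
            kept.append(conf)
    return kept
```

The same `deduplicate` call could be run both before QM and again on the QM-optimized geometries. Combining this with the existing automorph check would mean taking the minimum over atom permutations as well, which this sketch leaves out.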

For a reference implementation, see https://github.com/chemalot/chemalot/blob/b1bcaa832afddd05046ff97e3877d954c80f7eca/src/com/genentech/chemistry/openEye/apps/SdfRMSDSphereExclusion.java#L285-L293