crest-lab / crest

CREST - A program for the automated exploration of low-energy molecular chemical space.
https://crest-lab.github.io/crest-docs/
GNU Lesser General Public License v3.0

Problems with energy threshold between conformers #25

Closed fpabstDA closed 2 years ago

fpabstDA commented 3 years ago

I am trying to perform a conformer search on octane. The resulting energy differences between some conformers are very small (see below) and visual inspection shows that these conformers are the same. Using the -ethr option gives exactly the same result irrespective of the chosen value. Here is the first part of the .energies file:

    1  0.000
    2  0.544
    3  0.556
    4  0.558
    5  0.564
    6  1.024
    7  1.024
    8  1.024
    9  1.025

I am using crest version 2.10.2 and observed the same phenomenon also for other molecules. Thanks, Florian

pprcht commented 3 years ago

Such false-positive cases (especially for large ensembles of very flexible molecules) are not uncommon and are expected to a certain degree. The reason is that the program sorts conformers using mostly empirical thresholds, namely for the energy, the RMSD, and the rotational constants. We cannot assume that the same thresholds work equally well for all chemical systems across a wide variety of structures. In general the thresholds were adjusted to work well for molecules of up to 100 or 150 atoms, but there is no guarantee that they will work.

In the octane case you mention, energy differences can become very small because the conformational space of linear alkanes is extremely large and different conformers are often quite similar. With such small dE, the sorting is basically unaffected by -ethr. Making -ethr larger will change nothing because dE is already much smaller than the threshold. Making -ethr much smaller (or even setting it to something close to zero) would have the opposite effect and produce even more false positives. The wrong sorting here is caused neither by -ethr nor by the RMSD threshold, but most likely by the rotational constant threshold (-bthr).
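To make the argument above concrete, here is a minimal Python sketch of a three-criteria decision rule. It is an illustration only, not CREST's actual sorting code: the function name `is_distinct_conformer`, the simplified rule itself, and the RMSD and rotational-constant values are assumptions. Only the two energies (0.544 and 0.556) are taken from the .energies listing above.

```python
def is_distinct_conformer(d_energy, rmsd, d_rot, ethr, rthr, bthr):
    """Simplified, assumed rule: two structures count as different conformers
    if their energies differ by at least ethr, or if both the RMSD and the
    rotational-constant deviation reach their thresholds."""
    if d_energy >= ethr:
        return True
    return rmsd >= rthr and d_rot >= bthr


# Energies of entries 2 and 3 from the listing (kcal/mol).
d_energy = 0.556 - 0.544
# Hypothetical values, chosen only for illustration.
rmsd = 0.30    # Angstrom
d_rot = 0.02   # relative deviation of rotational constants

# Increasing ethr does not change the outcome: dE is already below
# every threshold tried, so the other two criteria decide.
results = {
    ethr: is_distinct_conformer(d_energy, rmsd, d_rot, ethr, rthr=0.125, bthr=0.01)
    for ethr in (0.05, 0.5, 5.0)
}
print(results)

# Loosening the rotational-constant threshold is what merges the pair.
merged = not is_distinct_conformer(d_energy, rmsd, d_rot, ethr=0.05, rthr=0.125, bthr=0.05)
print(merged)
```

Under these assumptions, the pair is kept as two conformers for every -ethr value tried, and only a change to the rotational-constant threshold alters the result.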

We made some improvements to the sorting in the 2.11 development version compared to 2.10.2, but false-positive cases will still happen occasionally. We also experimented with a fourth criterion for the sorting, but so far have been unsuccessful in finding an appropriate one. If you do not need the entire ensemble but are rather interested in a selection of structurally diverse low-energy conformers, you could try the --cluster <number of wanted clusters> option in crest 2.11. This performs a clustering of the final ensemble after the conformational search, based on dihedral angles as structural descriptors, and will provide the most diverse structures.
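As a rough illustration of how dihedral angles can serve as structural descriptors for picking diverse structures, the following Python sketch uses a simple greedy max-min selection. This is a swapped-in technique for illustration, not the clustering that crest performs; the function names and the example dihedral values are hypothetical.

```python
import math


def dihedral_distance(a, b):
    """Euclidean distance between two dihedral vectors (degrees),
    taking the 360-degree periodicity of each angle into account."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % 360.0
        d = min(d, 360.0 - d)
        total += d * d
    return math.sqrt(total)


def pick_diverse(dihedrals, k):
    """Greedy max-min selection: start from structure 0 (assumed to be the
    lowest-energy one) and repeatedly add the structure that is farthest
    from everything already chosen."""
    chosen = [0]
    while len(chosen) < min(k, len(dihedrals)):
        candidates = [i for i in range(len(dihedrals)) if i not in chosen]
        best = max(
            candidates,
            key=lambda i: min(dihedral_distance(dihedrals[i], dihedrals[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen


# Hypothetical ensemble: three backbone dihedrals per structure, in degrees.
ensemble = [
    (180.0, 180.0, 180.0),  # all-anti
    (179.0, 181.0, 180.0),  # near-duplicate of structure 0
    (60.0, 180.0, 180.0),   # gauche+
    (300.0, 180.0, 180.0),  # gauche-
    (61.0, 179.0, 180.0),   # near-duplicate of structure 2
]

selected = pick_diverse(ensemble, 3)
print(selected)
```

In this toy ensemble the near-duplicates are skipped, because they lie close to an already selected structure in dihedral space even though their energies might differ slightly.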