crest-lab / crest

CREST - A program for the automated exploration of low-energy molecular chemical space.
https://crest-lab.github.io/crest-docs/
GNU Lesser General Public License v3.0

CREST with GFNFF and comparing with GFN2-xTB level. #21

Closed jungsdao closed 2 years ago

jungsdao commented 3 years ago

I'm working with conformers of long-chain radical species. I believe conformer sampling is a very important issue here, since only a fraction of the conformers is accessible at a given temperature. If I run CREST at the normal GFN2-xTB level, the initial connectivity of the geometry is not preserved, especially for radical species (-uhf 1). I used the gfnff option to avoid this problem, but I question the reliability of FF-level theory in terms of the energies it gives.

Since the GFNFF conformer ensemble file is not directly comparable using the -compare option because of the energy difference, I additionally ran -screen on the GFNFF ensemble. I then tried comparing the two ensembles: one from CREST with GFN2-xTB and the other from GFNFF. However, I am confused about how to interpret the result of this step. When the two ensembles are compared, the energies seem similar, but an RMSD threshold of 0.125 Å identifies only one pair as identical in geometry.

Please find the attached files:

- cal.xyz: original molecule (which is a radical)
- conf_gfnff.xyz: ensemble from GFNFF
- conf_xtb.xyz: ensemble from xTB
- rmsdmatch.dat, stdout.log: output files

My questions are: 1) Do I need to increase the RMSD threshold when comparing a large molecule like this one (94 atoms)? Should the default of 0.125 Å be used, or should it be larger depending on molecular size?

2) The attached result seems to indicate that only 16 and 18 from each ensemble are identical at an RMSD threshold of 1.0 Å. Is this the correct interpretation? If so, doesn't it mean that the GFNFF and xTB results are very different?

3) Can I then rely on the conformer energy distribution obtained from GFNFF, or is an additional -screen step, at least at the xTB level, necessary?

My question might not be well written; I'll clarify further if anything is confusing. Many thanks in advance.

question.zip

pprcht commented 3 years ago

There are multiple problems if ensembles from different levels of theory are compared. One is that the energies are not comparable. The idea to re-rank one of the ensembles at the other level is correct, but keep in mind that -screen does an entire re-optimization of the geometries. However, in the -compare tool you must provide ensembles including conformers and rotamers (typically the crest_rotamers.xyz files) because you must ensure that the same rotamer for each conformer is compared. Otherwise this could lead to artificially high RMSDs. In your case it would therefore be better to use -mdopt instead of -screen to avoid a sorting of the file, together with the crest_rotamers.xyz file from GFN-FF. -compare does a sorting and identification of the correct rotamer automatically.
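To illustrate why absolute energies from different levels of theory cannot be compared, here is a minimal Python sketch that reads the energies from a multi-structure XYZ ensemble and converts them to relative energies. It assumes that each structure's comment line starts with its total energy in Hartree, which should be checked against the actual files. The function names and the two-structure sample are hypothetical and are not part of CREST.

```python
HARTREE_TO_KCAL = 627.509474  # conversion factor, Hartree -> kcal/mol


def read_energies(xyz_text):
    """Return the energy of every structure in a multi-structure XYZ string.

    Assumption: the comment line (second line of each block) starts with
    the total energy in Hartree.
    """
    lines = xyz_text.splitlines()
    energies = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # skip blank lines between blocks
            i += 1
            continue
        natoms = int(lines[i].split()[0])
        energies.append(float(lines[i + 1].split()[0]))
        i += natoms + 2  # atom count line + comment line + atom lines
    return energies


def relative_kcal(energies):
    """Energies relative to the lowest structure, in kcal/mol."""
    e_min = min(energies)
    return [(e - e_min) * HARTREE_TO_KCAL for e in energies]


# Hypothetical two-structure ensemble, for illustration only.
sample = """
2
-1.00000000
H 0.0 0.0 0.0
H 0.0 0.0 0.74
2
-0.99900000
H 0.0 0.0 0.0
H 0.0 0.0 0.80
"""

for index, rel in enumerate(relative_kcal(read_energies(sample)), start=1):
    print(f"structure {index}: {rel:.3f} kcal/mol")
```

Relative energies obtained this way are only meaningful between two ensembles once both have been evaluated at the same level of theory.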

To briefly answer your other three questions:

  1. You can slightly increase it (maybe to something like 0.25-0.3 Å or so), but there is no general rule for how it should be chosen for larger systems.
  2. Working with thresholds always poses the problem of "false positive" results. The sorting is, by construction, only correct within the limitations of the thresholds. At large RMSD thresholds everything will be identified as identical at some point, which of course is wrong. 1.0 Å is too much; keep it rather small.
  3. This is really system dependent. In some cases the GFN2 conformers are better, in others the GFN-FF conformers are better. For complicated cases you should re-optimize the best conformers at DFT level and then decide.
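
The threshold effect described in point 2 can be shown with a small Python sketch. The RMSD matrix below is entirely hypothetical (it is not taken from the attached rmsdmatch.dat), and `count_matches` is an illustrative helper rather than anything CREST provides. It only counts how many pairs fall below a given threshold.

```python
def count_matches(rmsd_matrix, threshold):
    """Count structure pairs whose RMSD (in Å) is below the threshold."""
    return sum(1 for row in rmsd_matrix for value in row if value < threshold)


# Hypothetical RMSD values in Å between three conformers of ensemble A (rows)
# and three conformers of ensemble B (columns).
rmsd = [
    [0.08, 0.90, 1.40],
    [0.85, 0.28, 0.95],
    [1.30, 0.97, 0.60],
]

for threshold in (0.125, 0.3, 1.0, 1.5):
    print(f"threshold {threshold} Å: {count_matches(rmsd, threshold)} matching pairs")
```

With these made-up numbers, one pair matches at 0.125 Å and two at 0.3 Å, while at 1.0 Å seven of the nine pairs are labelled identical and at 1.5 Å all of them are. That is the false-positive behaviour to avoid.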