Closed alan-arnold closed 5 months ago
Hi Alan,
Thank you for looking into this. All the structures are optimized because it is unclear how large the energy change will be upon optimization. This depends on how far the M(T)D snapshot is from a local minimum.
The message that 1200 structures remain in the energy window looks like either a printout error or an actual bug; the structure count usually refers to the structures in the crest_rotamers_X.xyz files. I will check this.
The lowest-energy structures of each M(T)D run are collected and should occur at the end of the ensemble run.
This issue had no activity for 6 months. It will be closed in 1 week unless there is some new activity.
Me again, I'm afraid :) After a few more QCG experiments following your changes in #183, which by design trade a longer run time for a more comprehensive search for minima on the energy hypersurface, I realized that I don't understand why it's necessary to optimize all of the many structures in crest_rotamers_0.xyz that arise from the MTD sampling, rather than only those within a reasonable energy window of the current lowest-energy structure.
So, I took a close look at a few of the temporary files that are created during and after the MTD cycles.
This is the beginning of the output from a QCG solvation run for methane in a 100-water cluster that had previously been grown with aISS:
crest CH4.xyz --T 24 --qcg h2o.xyz --gsolv --alpb h2o --mdlen 5 -keepdir > CH4_crest_5_qcg100_3.1.out &
It appears that the 1200 optimized structures for the initial cycle of 12 MTDs are stored in the temporary file .cre_0.xyz
I imported this file into Excel (don't ask !!) and sorted the 1200 energies, which range from -35.07945300 to -34.95850797 Eh
The lowest energy structure (-35.07945300) is the 2nd in the file - the one after the 1st structure marked "!input"
This value seems to be taken as the "reference state Etot" in the output above for this 1st MTD Iteration.
However, only 18 other structures (not 1200) have Etot within 6.00 kcal/mol of this lowest energy, so the last line in the output above, i.e. "1200 structures remain within 6.00 kcal/mol window", does not appear to be correct. And a lot of CPU time is spent optimising them!
The same conclusion can be reached for the subsequent MTD cycles, which create the corresponding .cre_1.xyz, .cre_2.xyz etc., each with only 1000 structures this time (only 10 MTDs):
.cre_1.xyz from MTD Iteration 2 has only 8 of 1000 structures within 6 kcal/mol of the reference state Etot -35.09724733, and .cre_2.xyz from MTD Iteration 3 has only 6 of 1000 structures within 6 kcal/mol of the reference state Etot -35.11145228 etc.
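For anyone who wants to repeat this count without a spreadsheet, here is a minimal Python sketch. It is not part of CREST. It assumes that each frame in the .cre_N.xyz file is a standard XYZ block (atom count, comment line, coordinates) and that the first number on the comment line is the total energy in Eh, with any tag such as "!input" as a separate token. The function names and the conversion factor 627.5095 kcal/mol per Eh are my own choices, and the energy window is measured from the lowest energy found in the file.

```python
import io

HARTREE_TO_KCAL = 627.5095  # kcal/mol per Eh (assumed conversion factor)


def read_energies(handle):
    """Return one energy (Eh) per frame of a multi-structure XYZ stream.

    Assumption: the first token on each comment line that parses as a
    float is the total energy; other tokens (e.g. "!input") are ignored.
    """
    energies = []
    while True:
        header = handle.readline()
        if not header:          # end of file
            break
        if not header.strip():  # tolerate blank lines between frames
            continue
        natoms = int(header.split()[0])
        comment = handle.readline()
        energy = None
        for token in comment.split():
            try:
                energy = float(token)
                break
            except ValueError:
                continue
        if energy is None:
            raise ValueError(f"no energy found in comment line: {comment!r}")
        energies.append(energy)
        for _ in range(natoms):  # skip the coordinate lines
            handle.readline()
    return energies


def count_in_window(energies, window_kcal=6.0):
    """Count frames within window_kcal of the lowest energy (inclusive)."""
    e_min = min(energies)
    return sum(
        1 for e in energies
        if (e - e_min) * HARTREE_TO_KCAL <= window_kcal
    )


# Tiny in-memory example with made-up single-atom frames.
demo = io.StringIO(
    "1\n-35.00000000 !input\nC 0.0 0.0 0.0\n"
    "1\n-35.07945300\nC 0.0 0.0 0.0\n"
    "1\n-35.07500000\nC 0.0 0.0 0.0\n"
)
print(count_in_window(read_energies(demo)))
```

For a real file, pass an open file handle, e.g. `with open(".cre_0.xyz") as fh: print(count_in_window(read_energies(fh)))`. Note that the count includes the reference structure itself, so it will be one higher than a count of "other" structures within the window.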
So, is the "1200 (or 1000) structures remain within" just a printout bug, or am I misunderstanding what's going on here? And where do these 18+8+6 + ... lowest-energy structures end up?