Closed moabe84 closed 10 months ago
If you are referring to rotamers rather than true conformers, you can always turn the GC off entirely. This step produces mostly rotamers.
If, however, you are talking about true conformers that are similar, this can of course happen. There is no guarantee that low-lying conformations are hugely different. In fact, often it is reasonable to assume many low-lying structures are similar, at least if they belong to the same energy landscape funnel.
Should you want to force a lower number of conformers, you could loosen the comparison thresholds (i.e. increase the RMSD threshold in CREGEN with -rthr, and/or increase the energy threshold with -ethr, possibly also the rotational constant threshold with -bthr).
Alternatively, a more drastic solution would be to stick the final conformational ensemble (crest_conformers.xyz) into a PCA/k-Means clustering, which will group the most similar structures together. The criteria are a bit arbitrary with this, so I would only recommend it if you truly need to trim down the ensembles by a lot. We have an implementation that could be used with something like
crest struc.xyz --cregen crest_conformers.xyz --cluster 5
which tries to select 5 representative structures. But there are surely some Python packages that could do such clustering as well, with appropriate workarounds.
Thank you very much. I really appreciate your time and help. They are all good and helpful suggestions. Regarding the "crest struc.xyz --cregen crest_conformers.xyz --cluster 5" option, does it first cluster all the conformations (in the crest_conformers.xyz file) into 5 groups, based on the RMSD, and then select the one that has the lowest energy from each group?
Mostafa
The clustering is not based on the RMSD, but yes, the lowest energy structure for each group is returned in the end.
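To make the "group, then keep the lowest-energy member" idea concrete, here is a minimal Python sketch. It is not the CREST implementation: the descriptors, the energies, and the function names `kmeans` and `pick_representatives` are all hypothetical, and a plain k-means on made-up 2D feature vectors stands in for whatever PCA-based descriptors CREST actually uses.

```python
import math


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means on a list of equal-length feature vectors."""
    # Deterministic farthest-point initialisation: start from the first point,
    # then repeatedly add the point farthest from the centres chosen so far.
    centres = [list(points[0])]
    while len(centres) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centres))
        centres.append(list(far))

    labels = None
    for _ in range(iters):
        # Assign every point to its nearest centre.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Move each centre to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(x) / len(members) for x in zip(*members)]
    return labels


def pick_representatives(features, energies, k):
    """Return the index of the lowest-energy structure in each cluster."""
    labels = kmeans(features, k)
    best = {}
    for idx, lab in enumerate(labels):
        if lab not in best or energies[idx] < energies[best[lab]]:
            best[lab] = idx
    # Sort the representatives by energy, lowest first.
    return sorted(best.values(), key=lambda i: energies[i])


# Hypothetical data: six conformers described by two made-up descriptors
# (not RMSD values), with relative energies in kcal/mol.
features = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 4.9), (-4.0, 6.0), (-4.2, 6.1)]
energies = [0.00, 0.35, 1.20, 0.90, 2.10, 2.40]

representatives = pick_representatives(features, energies, k=3)
print(representatives)
```

The number of clusters plays the role of the 5 in --cluster 5. In this sketch the choice of descriptors decides which structures count as similar, which is where the arbitrariness mentioned above comes in.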
Many thanks Philipp.
One more question: I'm trying to apply the CREGEN ensemble clustering method to a traj file obtained from MTD simulations. For some reason, it seems to recognize only the first conformer and ignore the rest. I should note that it works perfectly for the traj file from the CREST conformational search calculations.
Here are part of the MTD traj file (only 5 conformers), the ref. structure, and the output file.
mtd_confs.xyz.txt
ref.xyz.txt
output.txt
Thanks, Mostafa
It's probably the energy window. Try increasing it with --ewin. The default is 6 kcal/mol.
Thanks Philipp. You're absolutely right. That was the energy window. In fact, this helped me to realize that the problem was the first conformer in the MTD traj file, which was exactly the same as the ref. structure. I removed that and now it works with the default 6 kcal/mol energy window. Great! You have been most helpful. Many thanks.
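For anyone hitting the same behaviour, the effect of an energy window can be sketched in a few lines of Python. This is an illustration only, not CREST's code. It assumes that each structure's comment line in the multi-structure xyz file starts with the energy in Hartree, that the window is measured from the lowest energy in the file, and that the toy energies below are made up. One plausible reading of the case above is that a single low-energy first structure pushed every MTD snapshot outside the 6 kcal/mol window.

```python
HARTREE_TO_KCAL = 627.5095  # conversion factor, kcal/mol per Hartree


def read_energies(xyz_text):
    """Collect the energy from the comment line of every structure.

    Assumes a concatenated xyz file: atom count, comment line whose first
    token is the energy in Hartree, then one line per atom.
    """
    lines = xyz_text.strip().splitlines()
    energies = []
    i = 0
    while i < len(lines):
        natoms = int(lines[i].split()[0])
        energies.append(float(lines[i + 1].split()[0]))
        i += natoms + 2  # skip to the next structure
    return energies


def within_window(energies, ewin=6.0):
    """Indices of structures within `ewin` kcal/mol of the lowest energy."""
    e_min = min(energies)
    return [
        i for i, e in enumerate(energies)
        if (e - e_min) * HARTREE_TO_KCAL <= ewin
    ]


# Toy ensemble (made-up numbers): one low-energy first structure followed by
# three higher-energy snapshots.
toy_xyz = """\
2
-1.0000
H 0.0 0.0 0.00
H 0.0 0.0 0.74
2
-0.9880
H 0.0 0.0 0.00
H 0.0 0.0 0.80
2
-0.9860
H 0.0 0.0 0.00
H 0.0 0.0 0.85
2
-0.9850
H 0.0 0.0 0.00
H 0.0 0.0 0.90
"""

all_energies = read_energies(toy_xyz)
kept_with_first = within_window(all_energies)         # only the first survives
kept_without_first = within_window(all_energies[1:])  # all snapshots survive
print(kept_with_first, kept_without_first)
```

With the first structure present, the snapshots lie roughly 7.5 to 9.4 kcal/mol above it and are discarded. Once it is removed, the window is measured from the lowest snapshot and all three fall inside it. Increasing the window, as with --ewin, would keep them in both cases.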
Hi Philipp, I'd like to ask you a question. I was analyzing the final generated conformations for a structure and realized that there are a lot of similar conformers according to the RMSD analysis. I'm wondering if there is any way to minimize the number of similar conformations perhaps through the keywords in the "Structure Crossing (GC)" calculations:
conformer energy window /kcal : 5.00
CN per atom difference cut-off : 0.3000
RMSD threshold : 0.2500
max. # of generated structures : 2500
It would be greatly appreciated if you could comment on this matter. Thank you very much.
Mostafa