dotsdl closed this issue 3 years ago
@jthorton and I met today to determine necessary components for this support. We concluded:
GridOptimizations are very similar to Torsiondrives, in that they are service-based, generating new OptimizationRecords on the server as they iterate forward.
[...] TorsiondriveDataset as a template for building out the GridOptimizationDataset.
[...] Torsiondrives [...]
The items in (4) may make GridOptimization of limited utility for @chapincavender in performing scans for dipeptides.
Thanks @dotsdl and @jthorton for moving forward with this! I think I'm not being very clear about what I want to do, so apologies from my end. As an alternative to doing a 2-D TorsionDrive with an expensive QM method, I want to explore:
1) Doing a 2-D TorsionDrive with a fast and less accurate method - a force field, a machine learning potential like ANI, or a semiempirical method like XTB.
2) Taking the geometries of the lowest-energy structures at each grid point as input for a constrained geometry optimization with the expensive QM method.
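To make the hand-off between the two steps concrete, here is a minimal sketch in plain Python. It does not use the QCSubmit or QCArchive APIs; the data layout, the function names `lowest_energy_geometries` and `constrained_optimization_inputs`, and the constraint dictionary format are all hypothetical and only illustrate the idea of selecting the lowest-energy geometry at each grid point and pairing it with dihedral constraints.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout, not a QCArchive record format.
GridPoint = Tuple[int, ...]                  # dihedral angles in degrees, e.g. (phi, psi)
Geometry = List[Tuple[float, float, float]]  # placeholder Cartesian coordinates
Candidate = Tuple[float, Geometry]           # (energy from the fast method, geometry)


def lowest_energy_geometries(
    scan: Dict[GridPoint, List[Candidate]]
) -> Dict[GridPoint, Candidate]:
    """Step 1 post-processing: keep the lowest-energy candidate at each grid point."""
    return {
        point: min(candidates, key=lambda c: c[0])
        for point, candidates in scan.items()
        if candidates
    }


def constrained_optimization_inputs(
    best: Dict[GridPoint, Candidate],
    dihedrals: List[Tuple[int, int, int, int]],
) -> List[dict]:
    """Step 2 setup: one input per grid point, with each scanned dihedral
    constrained to its grid value (hypothetical constraint format)."""
    inputs = []
    for point, (energy, geometry) in sorted(best.items()):
        constraints = [
            {"type": "dihedral", "indices": list(indices), "value": angle}
            for indices, angle in zip(dihedrals, point)
        ]
        inputs.append(
            {
                "grid_point": point,
                "initial_geometry": geometry,
                "fast_energy": energy,
                "constraints": constraints,
            }
        )
    return inputs


# Toy 2-D scan: two grid points, made-up energies and one-atom "geometries".
scan = {
    (-60, -45): [(-1.20, [(0.0, 0.0, 0.0)]), (-1.35, [(0.1, 0.0, 0.0)])],
    (-60, -30): [(-1.10, [(0.2, 0.0, 0.0)])],
}
best = lowest_energy_geometries(scan)
inputs = constrained_optimization_inputs(
    best, dihedrals=[(0, 1, 2, 3), (1, 2, 3, 4)]
)
```

Each entry in `inputs` would then seed one constrained optimization with the expensive QM method; how that submission is actually made is left out here because it depends on the QCSubmit dataset type used.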
I think this two-step approach resolves the issues in point (4) above because it includes wavefront propagation and multiple initial conformers in the 2-D TorsionDrive in the first step, albeit with the less accurate level of theory. I was using the term "grid optimization" to describe this two-step approach. I see now that QCArchive has a GridOptimization record that is constructed from a chain of optimizations using the previous step to seed the current step, so I think I misled you by applying that term incorrectly. Sorry for the misunderstanding!
I don't anticipate needing QCArchive's GridOptimization records in the near future. If the 2-D TorsionDrive in step 1 of the two-step approach fails with the fast method, then a GridOptimization dataset with either the fast or expensive method could replace the 2-D TorsionDrive in step 1. But the 2-D TorsionDrive is the preferred option, and I will move forward with that for now.
I think that I could accomplish the two-step approach without any additional infrastructure from QCSubmit by doing two sequential submissions - one submission for the 2-D TorsionDrive for all molecules, then a second submission for the constrained optimizations for all molecules. However, it would be nice to be able to start the constrained optimizations for molecules whose TorsionDrives are completed before the TorsionDrives are completed for all molecules. In the current infrastructure, my understanding is that I would have to iteratively update the second submission with new molecules as their TorsionDrives are completed from the first submission. As an alternative, QCSubmit could support the two-step approach in a single submission that submits a single record to QCArchive. The record metadata would include the results of the TorsionDrive in step 1 (i.e. grid points, lowest-energy geometries, lowest energy with fast method) and then each grid point would also have the results of the constrained optimizations with the expensive method in step 2.
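The iterative update of the second submission described above could be tracked with something like the following sketch. It is plain Python with no QCSubmit or QCArchive calls; the status string `"COMPLETE"`, the molecule names, and the helper `newly_completed` are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def newly_completed(statuses: Dict[str, str], already_submitted: Set[str]) -> List[str]:
    """Return molecules whose TorsionDrive has finished but which have not yet
    been added to the second (constrained optimization) submission."""
    return sorted(
        molecule
        for molecule, status in statuses.items()
        if status == "COMPLETE" and molecule not in already_submitted
    )


# Simulated status snapshots from two successive checks of the first submission.
snapshots = [
    {"ala": "COMPLETE", "gly": "RUNNING", "ser": "RUNNING"},
    {"ala": "COMPLETE", "gly": "COMPLETE", "ser": "ERROR"},
]

submitted: Set[str] = set()
batches: List[List[str]] = []
for statuses in snapshots:
    batch = newly_completed(statuses, submitted)
    if batch:
        batches.append(batch)    # each batch would extend the second submission
        submitted.update(batch)  # never submit the same molecule twice
```

Keeping the set of already-submitted molecules separate from the status snapshot means each check only yields molecules that finished since the previous check, which is the bookkeeping a single combined record would make unnecessary.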
This latter approach is almost certainly beyond the scope of this issue due to my misunderstanding of the term "grid optimization", so perhaps we should close this. If we decide to move forward with the approach in the previous paragraph, I can open another issue. Probably we will need to involve QCArchive as well to support the new record type.
Thank you @chapincavender, we'll close this for now then. Feel free to re-open if you find that this is something you will need, and then we'll pursue it.
@chapincavender is considering use of a GridOptimization Collection for amino acid QM treatment. This Collection currently does not have support in QCSubmit.