openforcefield / openff-qcsubmit

Automated tools for submitting molecules to QCFractal
https://openff-qcsubmit.readthedocs.io/en/latest/index.html
MIT License

Add GridOptimization Collection support #155

Closed: dotsdl closed 3 years ago

dotsdl commented 3 years ago

@chapincavender is considering use of a GridOptimization Collection for amino acid QM treatment. This Collection currently does not have support in QCSubmit.

dotsdl commented 3 years ago

@jthorton and I met today to determine necessary components for this support. We concluded:

  1. GridOptimizations are very similar to Torsiondrives, in that they are service-based, generating new OptimizationRecords on the server as they iterate forward.
  2. We can use the existing QCSubmit TorsiondriveDataset as a template for building out the GridOptimizationDataset.
  3. Some new components we will need in QCSubmit include:
    • results classes, similar to those for Torsiondrives
    • support for distances (bonds, in our case) and angles
  4. Some differences from Torsiondrives:
    • no wavefront propagation
    • only a single initial molecule allowed

The items in (4) may make GridOptimization of limited utility for @chapincavender in performing scans for dipeptides.

chapincavender commented 3 years ago

Thanks @dotsdl and @jthorton for moving forward with this! I think I'm not being very clear about what I want to do, so apologies from my end. As an alternative to doing a 2-D TorsionDrive with an expensive QM method, I want to explore:

1) Doing a 2-D TorsionDrive with a fast and less accurate method - a force field, a machine learning potential like ANI, or a semiempirical method like XTB
2) Taking the geometries of the lowest-energy structures at each grid point as input for a constrained geometry optimization with the expensive QM method.
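As a rough illustration of the hand-off between the two steps, here is a minimal sketch in plain Python. It does not use the QCSubmit or QCFractal API: the scan layout, the `lowest_energy_geometries` and `constrained_optimization_inputs` helpers, and the job dictionary format are all hypothetical and only meant to show the data flow.

```python
from typing import Dict, List, Sequence, Tuple

GridPoint = Tuple[int, int]                    # (phi, psi) grid angles in degrees
Geometry = List[Tuple[float, float, float]]    # placeholder Cartesian coordinates
Candidate = Tuple[float, Geometry]             # (energy from the fast method, geometry)


def lowest_energy_geometries(
    scan: Dict[GridPoint, List[Candidate]],
) -> Dict[GridPoint, Candidate]:
    """For each grid point, keep the candidate with the lowest fast-method energy."""
    return {
        point: min(candidates, key=lambda pair: pair[0])
        for point, candidates in scan.items()
        if candidates  # skip grid points that have no finished optimization
    }


def constrained_optimization_inputs(
    best: Dict[GridPoint, Candidate],
    dihedrals: Sequence[Tuple[int, int, int, int]],
) -> List[dict]:
    """Build one step-2 job per grid point, fixing each dihedral at its grid value.

    The dictionary layout is an assumption for illustration, not a real schema.
    """
    jobs = []
    for point in sorted(best):
        _, geometry = best[point]
        jobs.append(
            {
                "grid_point": point,
                "initial_geometry": geometry,
                "constraints": [
                    {"type": "dihedral", "indices": indices, "value": angle}
                    for indices, angle in zip(dihedrals, point)
                ],
            }
        )
    return jobs
```

The jobs produced this way would then be run as constrained optimizations with the expensive QM method, one per grid point.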

I think this two-step approach resolves the issues in point (4) above because it includes wavefront propagation and multiple initial conformers in the 2-D TorsionDrive in the first step, albeit with the less accurate level of theory. I was using the term "grid optimization" to describe this two-step approach. I see now that QCArchive has a GridOptimization record that is constructed from a chain of optimizations using the previous step to seed the current step, so I think I misled you by applying that term incorrectly. Sorry for the misunderstanding!

I don't anticipate needing QCArchive's GridOptimization records in the near future. If the 2-D TorsionDrive in step 1 of the two-step approach fails with the fast method, then a GridOptimization dataset with either the fast or expensive method could replace the 2-D TorsionDrive in step 1. But the 2-D TorsionDrive is the preferred option, and I will move forward with that for now.

I think that I could accomplish the two-step approach without any additional infrastructure from QCSubmit by doing two sequential submissions - one submission for the 2-D TorsionDrive for all molecules, then a second submission for the constrained optimizations for all molecules. However, it would be nice to be able to start the constrained optimizations for molecules whose TorsionDrives are completed before the TorsionDrives are completed for all molecules. In the current infrastructure, my understanding is that I would have to iteratively update the second submission with new molecules as their TorsionDrives are completed from the first submission. As an alternative, QCSubmit could support the two-step approach in a single submission that submits a single record to QCArchive. The record metadata would include the results of the TorsionDrive in step 1 (i.e. grid points, lowest-energy geometries, lowest energy with fast method) and then each grid point would also have the results of the constrained optimizations with the expensive method in step 2.
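The iterative update of the second submission could look roughly like the sketch below. It is plain Python with hypothetical names: `molecules_ready_for_step_two` and the `"COMPLETE"` status string are assumptions for illustration, and how the statuses are actually fetched from the server is left out.

```python
from typing import Dict, Iterable, List


def molecules_ready_for_step_two(
    torsiondrive_status: Dict[str, str],
    already_submitted: Iterable[str],
) -> List[str]:
    """Return molecules whose TorsionDrive has finished but which have not yet
    been added to the constrained-optimization submission."""
    done = set(already_submitted)
    return sorted(
        name
        for name, status in torsiondrive_status.items()
        if status == "COMPLETE" and name not in done
    )


# Each time the first submission is polled, only the newly finished
# molecules are appended to the second submission.
submitted: set = set()
status = {"ALA": "COMPLETE", "GLY": "RUNNING", "SER": "COMPLETE"}
new_batch = molecules_ready_for_step_two(status, submitted)
submitted.update(new_batch)
```

Repeating this check until every TorsionDrive has finished is the bookkeeping that a single combined submission would avoid.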

This latter approach is almost certainly beyond the scope of this issue due to my misunderstanding of the term "grid optimization", so perhaps we should close this. If we decide to move forward with the approach in the previous paragraph, I can open another issue. Probably we will need to involve QCArchive as well to support the new record type.

dotsdl commented 3 years ago

Thank you @chapincavender, we'll close this for now then. Feel free to re-open if you find that this is something you will need, and then we'll pursue it.