Open lilyminium opened 7 months ago
Hi @lilyminium good spot, I see that we try to propagate it to qcfractal here when making the OptimizationSpecification
but it seems to get lost along the way. With some digging in qcfractal I found that the OptimizationSpecification
forces the use of the default single-point protocol which is removing our wavefunction settings. My guess is to stop the wavefunction from being saved for an entire trajectory but we could have this restriction removed if we pass a protocol which only saves the final result from an optimization. cc @bennybp @dotsdl
QCElemental right now doesn't support protocols for the gradient calculations. The structure in QCElemental is a bit different than QCFractal: QCFractal reuses a full QCSpecification object, but QCElemental has its own, which unfortunately doesn't have protocols: https://github.com/MolSSI/QCElemental/blob/e942b810f1681b7d38209d0fed55b49954e6e4b5/qcelemental/models/procedures.py#L56
So I fix the protocols to be the default to at least maintain consistency in the DB (I should probably add a comment there). This was a huge pain during the migration from the legacy version to the current version.
This is part of a QCElemental refactor which has been a long time coming, unfortunately.
Since I'm taking over maintenance of QCSubmit, I took some time to wrap my head around this. Here are notes, maybe just for myself, though I'd really appreciate correction if I'm misunderstanding something:
dataset.submit
, the resulting QCF dataset has its qc_specification
's protocols
attribute overwritten..wavefunction
attribute populated by a ton of numbers.driver
value for optimizations has changed from gradient
to deferred
. This change in wording makes me think that there are plans for datasets that say "here is a QC spec, but DEFER evaluating it until the gradient evaluations are done". I didn't dig far into this, but I'm optimistic that this is the meaning. So I think the right solution is that the force_qcspec
validator should either warn or raise an error when a protocol is overwritten. Does that sound good to you @bennybp? If not, we can have QCSubmit raise an error in that situation instead.
Largely correct, with a few more clarifications:
* So it IS possible to get wavefunctions for these evaluations currently, but we'd need to submit a new BasicDataset using the final molecules from the existing optimization dataset's results. (and now that I look, @lilyminium has done exactly that [here](https://github.com/openforcefield/qca-dataset-submission/pull/341))
I vaguely plan (with no timetable) to have this built in, since it will be very common: https://github.com/MolSSI/QCFractal/issues/748
Another option is to overhaul the qcelemental models to take in two specifications, but that would be a bit more disruptive.
* In new QCF, the `driver` value for optimizations has changed from `gradient` to `deferred`. This change in wording makes me think that there are plans for datasets that say "here is a QC spec, but DEFER evaluating it until the gradient evaluations are done". I didn't dig far into this, but I'm optimistic that this is the meaning.
I wouldn't be so optimistic :)
That value pertains only to the driver (which is typically energy, gradient, hessian). There's an awkwardness when we allow that to be set by the user for an optimization, because it is largely ignored - whether an energy, gradient, or hessian is run is controlled by the optimization software, not the user. So what would it mean to submit an optimization with driver = energy
when geometric
is just going to do gradients anyway. So the (admittedly hacky) idea was to make this deferred
value for the enum - the overall type of calculation is being deferred to the optimization software.
* @bennybp's post just above this one reiterates that it wasn't and isn't possible to request wavefunctions in optimizations. I'm not sure if "This is part of a QCElemental refactor which has been a long time coming, unfortunately." means "unfortunately it will continue NOT being possible after a refactor that has/will happen" or "it will be possible after the refactor, which unfortunately will take a very long time". Could you clarify?
It's been on the table for a good while now, along with some qcelemental/qcschema splitting ideas. They just end up taking a back seat to other pressing issues.
So I think the right solution is that the
force_qcspec
validator should either warn or raise an error when a protocol is overwritten. Does that sound good to you @bennybp? If not, we can have QCSubmit raise an error in that situation instead.
I agree. The current behavior is technically correct but hidden. So raising an exception is better than silently changing their input.
Awesome. Thanks for the thorough response. This makes a lot of sense, and I think it'd be great to put an exception in QCFractal when protocols are overwritten :-)
When creating an
OptimizationDataset
, I would expectstore_wavefunction
to get propagated into the generated QCSpecification. However, I don't think this is currently happening -- should it be?Code
Looking at the
qc_specifications
they do seem to be there:It also works for a BasicDataset:
(cc https://github.com/openforcefield/qca-dataset-submission/pull/335#issuecomment-1835378295)
Version: 0.50.1