choderalab / espaloma

Extensible Surrogate Potential of Ab initio Learned and Optimized by Message-passing Algorithm 🍹https://arxiv.org/abs/2010.01196
https://docs.espaloma.org/en/latest/
MIT License
212 stars 23 forks source link

Qcarchive update #187

Open chrisiacovella opened 1 year ago

chrisiacovella commented 1 year ago

This updates qcarchive_utils.py to be compatible with v0.5 of qcportal. Relates to issue #185

This code reproduces the same behavior as the prior implementation.

mikemhenry commented 1 year ago

Awesome! This is good timing with #186

Once we get both in, we should cut a new release.

chrisiacovella commented 1 year ago

This PR implements the logic in effectively the same way as the old code, which is on a per-record basis (i.e., a function operates on a single record name). The new version of qcportal has iterators on records, which are substantially faster (like orders of magnitude, due to prefetching and caching). The next commit will include functions that operate on the entire record sets to avoid slow performance.

codecov-commenter commented 1 year ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (main@2e61215). Click here to learn what that means. The diff coverage is n/a.

Additional details and impacted files
mikemhenry commented 1 year ago

This line will need to get changed @chrisiacovella https://github.com/choderalab/espaloma/pull/187/files#diff-ba5d22563299549a389183418fe5786b83275382be592bf1ed06fae673b7d086L23 Sorry about that!

mikemhenry commented 1 year ago

We can probably remove that line since https://github.com/choderalab/espaloma/pull/187/files#diff-ba5d22563299549a389183418fe5786b83275382be592bf1ed06fae673b7d086R33 will pull in what we need (I think, I am not sure what the "main" qcarchive package is)

chrisiacovella commented 1 year ago

This line will need to get changed @chrisiacovella https://github.com/choderalab/espaloma/pull/187/files#diff-ba5d22563299549a389183418fe5786b83275382be592bf1ed06fae673b7d086L23 Sorry about that!

Good catch.

chrisiacovella commented 1 year ago

From @jchodera

There are apparently some additional issues with the object model such that datasets beyond OptimizationDataset are not supported

Yes. the get_graph function in the initial code was only setup to work with the OptimizationDataset. I think it would be straight forward to support the SinglepointDataset objects and put in some checking in get_graph and get_graphs to give a descriptive failure message if a different set is tried.

kntkb commented 1 year ago

@chrisiacovella I remember when fetching the results from the SinglepointDataset that uses b3lyp-d3bj (openff default level of theory), you needed to combine the results from the DFT and the dispersion correction terms. This is not the case for OptimizationDataset and TorsionDriveDataset. I wonder if this behavior is the same for the latest QCArchive server and qcprotal.

chrisiacovella commented 1 year ago

@chrisiacovella I remember when fetching the results from the SinglepointDataset that uses b3lyp-d3bj (openff default level of theory), you needed to combine the results from the DFT and the dispersion correction terms. This is not the case for OptimizationDataset and TorsionDriveDataset. I wonder if this behavior is the same for the latest QCArchive server and qcprotal.

@kntkb This is something I started looking at when switching from the old to the new version, but I can't seem to find my notes; for some reason I think one of the specifications does include the sum, but don't quote me on that. I'm currently trying to figure that out right now actually.