MolSSI / QCEngine

Quantum chemistry program executor and IO standardizer (QCSchema).
https://molssi.github.io/QCEngine/
BSD 3-Clause "New" or "Revised" License
163 stars 79 forks source link

Implement support for DFT-D4 #291

Closed awvwgk closed 3 years ago

awvwgk commented 3 years ago

Description

Note: The Python API must be explicitly enabled when building dftd4 or can exist separately from a dftd4 installation. The harness can therefore fail to use dftd4 even if a dftd4 installation is present, in this case it could have a fallback for a FIFO calculator (xyz + command line input, JSON dump output), but this is work for a different PR.

Changelog description

Implement support for DFT-D4

Status

codecov[bot] commented 3 years ago

Codecov Report

Merging #291 (59271ad) into master (7ac0f54) will increase coverage by 0.90%. The diff coverage is 96.03%.

loriab commented 3 years ago

Note: The Python API must be explicitly enabled when building dftd4 or can exist separately from a dftd4 installation. The harness can therefore fail to use dftd4 even if a dftd4 installation is present, in this case it could have a fallback for a FIFO calculator (xyz + command line input, JSON dump output), but this is work for a different PR.

Can def found look for both which_import(dftd4) and then check separately for the API component or the external API package before declaring the program found? similar to https://github.com/MolSSI/QCEngine/blob/master/qcengine/programs/nwchem/runner.py#L55

loriab commented 3 years ago

don't worry about nwchem::test_error_correction -- it's ~20% flaky

awvwgk commented 3 years ago

Can def found look for both which_import(dftd4) and then check separately for the API component or the external API package before declaring the program found?

This should be the way to go, for now I want to just flesh out the Python API of dftd4. I had a brief look into the dftd3 FIFO harness which is incredible more complex than using a Python API, therefore this is nothing I want to tackle now.

I would be more interesting if the proposed interface is optimal for your purpose. I'm using the keywords to specify the damping parameters or the method name that should be looked up in the internal database. For dftd3 is handled a bit differently, what is there preference here?

loriab commented 3 years ago

I would be more interesting if the proposed interface is optimal for your purpose. I'm using the keywords to specify the damping parameters or the method name that should be looked up in the internal database. For dftd3 is handled a bit differently, what is there preference here?

The Python API alone will be great, thank you. No need for file + cmdline route -- that was just the only way circa 2012.

If it's all the same to you, would you adapt the fields toward https://github.com/psi4/psi4/blob/master/psi4/driver/procrouting/empirical_dispersion.py#L193-L209 ? Main changes are:

d3/d4 don't fit too well into the AtomicInput layout, unfortunately, but the above would be helpful in integrating into psi if it's not too much trouble.

awvwgk commented 3 years ago

Sounds good to me, DFT-D4 will provide this interface in the Python API for running QCSchema input. The only particularity, I don't allow to overwrite individual damping parameters, so either a method name or all parameters are required.

loriab commented 3 years ago

The only particularity, I don't allow to overwrite individual damping parameters, so either a method name or all parameters are required.

That's no problem for psi, as once user input passes through the resolver, a full set of parameters is always provided.

Thanks for the changes.

awvwgk commented 3 years ago

DFT-D4 3.1.0 is rolled out to conda-forge, just another hour until it propagates through the CDN and we can use it here to setup the testing environment.

awvwgk commented 3 years ago

What is the current status of this patch? It has been stale for a while now, is there anything I can do to help move this forward?

loriab commented 3 years ago

What is the current status of this patch? It has been stale for a while now, is there anything I can do to help move this forward?

I think it's great and good to merge. I'm sorry for the delay; I was hoping to get https://github.com/awvwgk/QCEngine/pull/63 merged into this beforehand, but then I got a cycle behind you on dftd4 revisions and didn't revisit. Shall we get this merged sooner, and then I'll propose updates once I've rechecked the abovementioned harness PR and https://github.com/psi4/psi4/pull/2142 ?

lgtm-com[bot] commented 3 years ago

This pull request introduces 1 alert when merging 8fdefe5ecbd6ca9036d3e69dc52ad320ad2e8016 into 43639f21ac3dd29d7984ffca3ae12d79733cc99b - view on LGTM.com

new alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 1 alert when merging e805d1b7c72b0cb59d58dea9598f6b81e8e8d55e into 43639f21ac3dd29d7984ffca3ae12d79733cc99b - view on LGTM.com

new alerts:

awvwgk commented 3 years ago

@loriab, sorry, I didn't see the PR against my fork, I usually don't watch my own forks because I have bots syncing them with upstream via PRs, so 99% of the activity there is usually noise (or myself).

I had a quick look over your changes and added a few adjustments to the parameter reading. I also wonder why qcng is calling it d4bj rather than going with the actual name which is d4?

awvwgk commented 3 years ago

There are a few tests from test_dftd3 now failing, but I guess those are the added tests for dftd4. I think we should come up with a clearer naming here, otherwise this is just confusing.

Any idea why those fail? Did I break something upstream already?

lgtm-com[bot] commented 3 years ago

This pull request introduces 1 alert when merging f83605a49483e01be222eb3f8b80b5d981279b99 into 43639f21ac3dd29d7984ffca3ae12d79733cc99b - view on LGTM.com

new alerts:

awvwgk commented 3 years ago

I might have to look into the damping parameter generation again, we might be propagating a bad API decision from the C or Python layer up into qcng here (see https://github.com/dftd4/dftd4/issues/98). It might be worth to make both method and param_tweaks input an error at the Python level in dftd4, catch in the qcschema runner in dftd4 and send it back as ComputeError to qcng at some point until dftd4 is actually able to handle this case (load method first, overwrite parameters one by one).

loriab commented 3 years ago

Was renaming the reference values like https://github.com/MolSSI/QCEngine/pull/291/commits/89f467670c5e39acd561de1f90299d430f949c82#diff-c24ed775befa47b0b9de0da35810c7a47d5a6c5aa7c35b3e892a8b12abb09de6L78 solely to get the tests to work? Or do you very much not want the current definition of D4 to appear under the D4(BJ) label? I ask because I think that renaming of the refs is hiding another problem. Right now without damping popped, everything is getting caught in https://github.com/MolSSI/QCEngine/pull/291/files#diff-db09fb462ae1f6675bd709a50e8180a58d94038aedc9331b2f71d47e8b0c9d93R90 . Generally, I try to make the extras["qcvars"] as specific as possible, so no matter if model.method is b3lyp-d4 or b3lyp-d4(bj), the qcvar is the latter.

I can fix all this up. But I'd like to (1) not have damping in the params list https://github.com/MolSSI/QCEngine/pull/291/files#diff-d3e3a63b874003f6347c470b383ecfc60206b7c05b53f03a03f9e0ca04f081b3R806 since it's already specified at L798 and (2) have the qcvar label be D4(BJ) for reliability rather than whatever's in model.method. Is either change unsavory to you? Alternately on (2), we can have both D4(BJ) and D4 stored. (But only for D4=D4(BJ) among aliases, as I don't want to declare D=D2 https://github.com/MolSSI/QCEngine/blob/master/qcengine/programs/empirical_dispersion_resources.py#L19) :-)

awvwgk commented 3 years ago

Naming the dashlevel D4(BJ) is not completely accurate actually, we defined D4 as alias for D4(BJ, EEQ)-ATM in the original DFT-D4 publication, which was one of four potential methods we tested in the course of that work (D4(BJ, GFN2-xTB)-ATM, D4(BJ, GFN2-xTB)-MBD and D4(BJ, EEQ)-MBD were the other three). Therefore, I'm not completely happy with D4(BJ) because it misses the actual important information on the dispersion model.

loriab commented 3 years ago

Naming the dashlevel D4(BJ) is not completely accurate actually, we defined D4 as alias for D4(BJ, EEQ)-ATM in the original DFT-D4 publication, which was one of four potential methods we tested in the course of that work (D4(BJ, GFN2-xTB)-ATM, D4(BJ, GFN2-xTB)-MBD and D4(BJ, EEQ)-MBD were the other three). Therefore, I'm not completely happy with D4(BJ) because it misses the actual important information on the dispersion model.

Makes sense. Not recording the 3-body aspect was worrying me a bit, too. See what you think of https://github.com/awvwgk/QCEngine/pull/65, please. It gives me a clean test suite for qcng and for psi4 https://github.com/psi4/psi4/pull/2142.

Also feel free to rebase to incorporate #297 test suite fixes.

loriab commented 3 years ago

Thanks for the merge. I've push a commit directly to try to fix the d4 CI.

loriab commented 3 years ago

FYI, upon import psi4, psi4 processes the QCEngine empirical_dispersion_parameters file and thus now the parameters file from dftd4-python if present. That can crash if this line allowing DOIs isn't present, as it is now in psi4 master. I mention this in case someone has an older psi4 in the presence of dftd4-python and raises the issue with you instead of me, the guilty party.

awvwgk commented 3 years ago

Can we make the setup of the parameters for DFT-D4 lazy? Like just populate the fields if somebody actually asks for them when calling DFTD4 or constructing damping parameters? This would greatly reduce the probability that somebody accidentally runs into this issue. (my bad, that wasn't actually the problem)

Maybe this will not trip off the check in Psi4? I have to admit it's a bit hacky.

diff --git a/qcengine/programs/empirical_dispersion_resources.py b/qcengine/programs/empirical_dispersion_resources.py
index f89bfd7..adbe4f1 100644
--- a/qcengine/programs/empirical_dispersion_resources.py
+++ b/qcengine/programs/empirical_dispersion_resources.py
@@ -866,7 +866,7 @@ def _get_d4bj_definitions() -> dict:
             citation = params.pop("doi", None)
             definitions[method] = dict(
                 params=params,
-                citation=citation,
+                citation="    " + citation + "\n",
             )
         except KeyError:
             continue
loriab commented 3 years ago

Maybe this will not trip off the check in Psi4? I have to admit it's a bit hacky.

Yes, that works nicely, thanks.

awvwgk commented 3 years ago

Thanks a lot for your help. It took a long way to finally make DFT-D4 available here, but I think the dftd4 project has profited a lot from this collaboration.