mosdef-hub / foyer

A package for atom-typing as well as applying and disseminating forcefields
https://foyer.mosdef.org
MIT License
119 stars 78 forks source link

Add Forcefield.combining_rule #433

Closed mattwthompson closed 3 years ago

mattwthompson commented 3 years ago

PR Summary:

Fixes #431 and implements #432

Below this line are old comments I think are resolved


Right now I can't get the @property working, even though I basically copied the existing name and version properties. A script like this should run, but doesn't work for me at the moment locally:

import foyer
oplsaa = foyer.Forcefield(name="oplsaa")
assert oplsaa.combining_rule == "geometric"

As discussed in #432, there are a few names that could be used; I have zero preference and will gladly update it to what the Foyer maintainers prefer.

I updated foyer.xml_writer._infer_lj_14scale to assume geometric mixing rules, just to get tests passing. This should probably be updated to take the combining rule as an argument, however.

Forcefield.apply() takes an argument combining_rule. This is IMHO an anti-pattern, I believe they are properties of force fields and not the typing or parametrization processes. For now I've coded in an exception for when each are defined but not matching; some more care should probably be taken to not break users' workflows. (Though I'd argue there are some cases in which the application of a force field is wrong and workflows should be broken.) The case of using Foyers' OPLS-AA and calling Forcefield.apply(..., combining_rule="geometric") should be fine, since the parameters match.

PR Checklist


codecov[bot] commented 3 years ago

Codecov Report

Merging #433 (b4afdff) into master (fc44f80) will increase coverage by 0.25%. The diff coverage is 85.36%.

:exclamation: Current head b4afdff differs from pull request most recent head c482f2a. Consider uploading reports for the commit c482f2a to get more accurate results

@@            Coverage Diff             @@
##           master     #433      +/-   ##
==========================================
+ Coverage   73.58%   73.84%   +0.25%     
==========================================
  Files          17       17              
  Lines        1802     1839      +37     
==========================================
+ Hits         1326     1358      +32     
- Misses        476      481       +5     
justinGilmer commented 3 years ago

For testing, what do you think is an amenable solution: should we have two different top files, for each supported combining rule? Or just test the one based on the combining rule specified in the XML?

mattwthompson commented 3 years ago

I think both cases should be tested, but the Lorentz-Berthelot case should probably not use OPLS-AA.

The XML writing failures earlier made me a little weary of how much processing is done of the final parametrized structure. This is effectively "backwards" of the normal flow, so some more consideration might be needed there and in similar cases.

mattwthompson commented 3 years ago

I'm baffled by the coverage report; it reports that the exception I added is not covered, but the code that raises is it covered. Function definitions _patch_parmed_adjusts and _parse_combining_rule are reported as not covered, yet their contents are covered.

mattwthompson commented 3 years ago

Okay, I've added tests for each case using one-off force field files. Reading the combining rule from the file and setting it on the force field does not work for some reason I can't figure out; I can set a debug around here and find that the data is being parsed correctly, but it's not set on the result. Maybe something is up with how the entry points are configured? I could please use some help here, it's essential behavior to getting the tests working. If that's resolved, I should be able to turn around tests and get this ready for review quickly.

justinGilmer commented 3 years ago

Quick q, does it work in the instance when it is not a list of forcefield files? Or does it not apply the combination rule in all instances?

justinGilmer commented 3 years ago

From my testing, something has to be going out of scope. This is a MWE that fails for all 3 top level properties.

import foyer
oplsaa = foyer.Forcefield(name="oplsaa")
print('mwe')
print(oplsaa.combining_rule)
print(oplsaa.version)
print(oplsaa.name)
assert oplsaa.combining_rule == "geometric"

name, version, and combining_rule seem to be None.

IF HOWEVER, you load the XML by using Forcefield(forcefield_files="path/to/xml") that does seem to work properly.

justinGilmer commented 3 years ago

Can also confirm that version, name, combining_rule all persist if you create a forcefield object using

import foyer
oplsaa = foyer.forcefields.load_OPLSAA()
print(oplsaa.combining_rule)
print(oplsaa.name)
print(oplsaa.version)
>>> geometric
>>> OPLS-AA
>>> 0.0.1
mattwthompson commented 3 years ago

I can reproduce that, but I definitely can't explain it on first glance. It's safe to assume that 100% of my usage is from a single file, and I'm not sure how much of Foyer's other uses involve multiple XMLs. This is a pattern inherited from OpenMM (the OpenFF Toolkit inherited as well) that's often used to load in i.e. a water force field and protein force field as separate files.

This is a pretty significant problem for tests and behavior - I can work around it for OPLS-AA by using a common pytest fixture, but loading in a one-off file and inspecting its attributes is pretty important, especially if the combining rule in the force field should line up with .apply().

There was a test for loading in directly from a file, but not one for the name="" pattern, which is more commonly-used.

ahy3nz commented 3 years ago

Um I'm a little rusty on foyer's forcefield-creation logic, but it seems the forcefield constructor (the foyer forcefield constructor, not even mentioning the logic handled in the simtk app forcefield constructor) is called twice in a nested fashion? And I think on the second, nested init-logic, that's when the discrepancy in handling name='oplsaa' vs forcefield_files=[files] happens

ahy3nz commented 3 years ago

https://gist.github.com/ahy3nz/4a14cb401067b3ce06ff71be7b89fa06

I don't want to push to this PR or make my own branch, so here's a snippet of what I tried in forcefield.py to get justin's 3 examples to work.

(I just remembered the double-constructor thing is just what happens when the foyer Validator is used, so it's not really a relevant issue here)

EDIT: actually instead of using preprocessed_files to grab version, name, and combining rule and then running into the removed-files problem, maybe just use all_files_to_load which will refer to all the persistent ffxmls

mattwthompson commented 3 years ago

I'm very confused, so I opened a new issue: https://github.com/mosdef-hub/foyer/issues/434

Some of the above ideas work for different ways of loading files (entry points, directly pointing to the file, or using name) but not all. I could be doing something wrong, but I do think it's worth splitting it out.

mattwthompson commented 3 years ago

Everything's passing locally and I've covered the behavior and tests I'm after. I think if tests pass this is ready for some eyes.

justinGilmer commented 3 years ago

@mattwthompson are all tests passing locally for you? The AZP tests seem to run into a few problematic tests.

mattwthompson commented 3 years ago

Everything was passing locally, but I was missing a couple of deps. Most should be fixed, but it also could take a couple of iterations.

mattwthompson commented 3 years ago

Okay, passing now!

mattwthompson commented 3 years ago

Thanks for the quick turnaround @justinGilmer @daico007!