openforcefield / standards

A repository of the standards employed across the Open Force Field Consortium.
https://openforcefield.github.io/standards
MIT License
1 stars 3 forks source link

OFF-EP 0005b #34

Closed jchodera closed 2 years ago

jchodera commented 2 years ago

An alternative proposal to #30 to fix #29.

Other modifications that may be possible:

jchodera commented 2 years ago

I haven't yet updated the smirnoff.md from #30, but please take a look at the proposed alternative OFF-EP 0005.

davidlmobley commented 2 years ago

I think I'm fine with this as well; I'd like to hear from @mattwthompson and @j-wags but it still seems relatively compact.

mattwthompson commented 2 years ago

I don't have a seat on the committee and but since I was asked -

30 solves, in my view, the problems we have identified as short-term blockers for version 0.11.0 of the toolkit (and a 0.2.0 release of Interchange, which will be paired with it) as I am trying to prevent #29 from creeping into Interchange. I intended for it to be unambiguously discrete in scope to the changes I need, both because I have not done research cycles on other topics and because I hoped it would be easier to review a more narrow proposal; the assumption is that it is easier to gain consensus on one move part than several. (This has been my hope with the OFF-EPs I have written, that targeted, narrow proposals would be easier to review, much like PRs to source code - even at the cost of having a larger number of proposals creating extra chatter.)

I think specifying physical constants would be an excellent improvement to SMIRNOFF, but I'm not sure it's in scope here and think it would be better split out into a separate proposal. Two questions that immediately come to mind are

PME (and other methods) are permissible approximations to Ewald as long as they are controlled.

These needs clarification, presumably with coming changes to smirnoff.md. What does "controlled" mean here? What approximations are permissible, i.e. is it fine that Amber and OpenMM use different B-splines of different order?

Should the default dielectric be 78.3 (water) or 78.5 (consistency with Amber)? Does this difference matter? Separately from this proposal, it would be worth seeing if OpenFF's fitting/benchmarking pipelines are consistent - both in terms of this constant and any other changes to the electrostatics definitions.

I have no idea if the reaction field expression given in the example is supported by engines other than OpenMM; since it's not been an operational priority, I haven't dug into it and it's not a technique I'm familiar with. So I'm not able to given feedback on it except to repeat the same concern as above - if it's a requirement, it might be ignored when interfacing with slightly different implementations.

I also don't know how different engines compute electrostatics of intramolecular interactions. To be honest, it's only recently come to my attention that this is a detail that engines have opinions about.

I just need the specification revised to support the most common use case of our force fields: PME (with details possibly left unspecific) for periodic systems and something-close-to-exact-Coulomb for non-periodic systems. This proposal seems like it only adds content, if I'm reading things accurately. Assuming that updates to smirnoff.md in a final version of this proposal don't modify this, that need is met. I'm unable to provide useful feedback on some other points, though, so I will defer to the committee's judgement unless asked again.

j-wags commented 2 years ago

PME (and other methods) are permissible approximations to Ewald as long as they are controlled.

These needs clarification, presumably with coming changes to smirnoff.md. What does "controlled" mean here? What approximations are permissible, i.e. is it fine that Amber and OpenMM use different B-splines of different order?

(not blocking) This would be good to answer. I don't see it as a hard blocker for my approval, and I don't know enough about PME implementations to write this in smirnoff.md myself. But I'd basically trust any reasonable scheme agreed upon by more knowledgeable committee members.

jchodera commented 2 years ago

@mattwthompson @j-wags : Thanks so much for your detailed feedback!

Scope: My assumed scope differed from @mattwthompson's #29 in that I considered the scope of this OFF-EP to be "eliminate ambiguities in the definition of the Electrostatics tag", rather than "eliminate one ambiguity in the definition but leave a boatload of other ambiguities". I appreciate that there is tension between the desire to make narrowly-scoped, easily reviewable changes and the amount of additional work required to review, accept, and then implement each change, especially if this requires backward- (and possibly forward-)compatibility. I'm totally OK with implementing a series of narrowly-scoped changes if the software scientists are fine with implementing all of them, but it seems like this has the potential to create a great deal more work.

I think specifying physical constants would be an excellent improvement to SMIRNOFF, but I'm not sure it's in scope here and think it would be better split out into a separate proposal.

With my assumed scope of eliminating ambiguities, this was an ambiguity in the definition of how the potential was calculated that we need to explicitly address in this OFF-EP.

How should implementations handle exports to engines that use different versions of CODATA (or a different system altogether)? Forbidding export to to i.e. GROMACS 2019 (which uses CODATA 2010) would be onerous, but allowing implementations to skip this check altogether implies it's not necessary - or should be communicated as a recommendation, not a requirement.

If we know which constants a package uses internally to compute the potential energy from our parameters, it is trivial to adjust the parameters on export to ensure the energies match their intended values. This is only possible if we specify what physical constants we intend the spec to use. If we do not, we will just get different energies in each package.

Assuming one doesn't wish SMIRNOFF to be on CODATA 2018 forever, what is the process for applying updates?

As noted above, a future OFF-EP could explicitly specify the physical constants used at a higher level---perhaps at top level---and remove the specification of which physical constants are used from the Electrostatics tag. We could then update the physical constants as desired. This change was outside of my assumed scope, however---which was confined to just the Electrostatics tag.

These needs clarification, presumably with coming changes to smirnoff.md. What does "controlled" mean here? What approximations are permissible, i.e. is it fine that Amber and OpenMM use different B-splines of different order?

We need to agree on a criteria for this, but my assumption here is that we will first need to see some data before we can construct a criteria. One possible approach, for example, would be to curate a set of test systems for which we could provide Interchange parameters and reference energies for a true Ewald implementation, and benchmark the exported energies in other packages with a variety of settings (e.g. different pme-tolerance values in OpenMM with PME). We could then specify "your settings should reproduce these energies to within X kcal/mol maximum deviation on the provided systems and snapshots". This is but one idea, but it would provide a concrete way of determining whether the force field is correctly implemented in a package with particular settings.

Should the default dielectric be 78.3 (water) or 78.5 (consistency with Amber)? Does this difference matter? Separately from this proposal, it would be worth seeing if OpenFF's fitting/benchmarking pipelines are consistent - both in terms of this constant and any other changes to the electrostatics definitions.

We can choose not to define a default dielectric in this spec.

I have no idea if the reaction field expression given in the example is supported by engines other than OpenMM; since it's not been an operational priority, I haven't dug into it and it's not a technique I'm familiar with. So I'm not able to given feedback on it except to repeat the same concern as above - if it's a requirement, it might be ignored when interfacing with slightly different implementations.

I did not define any keywords for reaction field electrostatics because I haven't had time to review which functional forms are implemented in different packages. It will require some effort to make sure implementations are standardized or can be implemented or approximated in multiple packages. For now, the safest thing to do is either exclude them (which is inconvenient, since we know we need to support Cresset with an optimized reaction field form) or explicitly specify the functional form (which likely for Cresset with their OpenMM implementation). Some refinement of this part of the proposal may be necessary, however, if we want to easily be able to render this to NonbondedForce instead of CustomNonbondedForce in Interchange.

I also don't know how different engines compute electrostatics of intramolecular interactions. To be honest, it's only recently come to my attention that this is a detail that engines have opinions about.

These details matter for the energies they produce, so we cannot ignore them.

You and @mattwthompson have done most of the writing on this, so I'd be happy to edit the smirnoff.md file with the proposed changes, and the "Backward compatibility" section of this EP with a more thorough recommendation for up-conversion (I'll have to write an up-converter anyway, so it couldn't hurt to specify exactly what it will do). Please tell me if you'd like me to make these edits and I can do so on Monday!

@j-wags : Please feel free to make further changes to the branch directly! My goal here was to provide the basis of an alternative OFF-EP that more thoroughly addressed all identified ambiguities in the Electrostatics tag, but I am not committed to any of the specific solutions proposed here.

I'll respond to the other comments inline.

mattwthompson commented 2 years ago

The remaining loose ends (PME details, reaction field implementation, intramolecular interactions) are each improved or not made worse by this proposal, so I don't see any reason to reject these proposals (a. I think we're in a good position to tackle them in the future.

jchodera commented 2 years ago

Thanks for the changes, @j-wags!

Do we expect that (1) we will want to use a cutoff for non-periodic systems (which are primarily intended to match QM calculations) at some point in the future, and (2) we will want to at some point enable the cutoffs to be distinct in these cases? If so, this seems like a good future-proof change. I just couldn't envision a use case for either.

j-wags commented 2 years ago

Do we expect that (1) we will want to use a cutoff for non-periodic systems (which are primarily intended to match QM calculations) at some point in the future, and (2) we will want to at some point enable the cutoffs to be distinct in these cases?

No opinion - my experience with cutoff potentials is limited. @SimonBoothroyd or @davidlmobley, do you have an opinion?

mattwthompson commented 2 years ago

Do we expect that (1) we will want to use a cutoff for non-periodic systems

The Shirts, 2017 way of testing for energy equivalence (which I would like to be able to do, https://github.com/openforcefield/standards/issues/6) uses a hard cut-off for electrostatics. But I don't know if this is something that must be supported for both periodic and non-periodic systems. I can't think of a use case outside of that (which would be strictly testing-only and never touch production).

If there is any friction associated with officially supporting this feature I'd just modify a system after preparation to do this, which is only a tiny detour away from official SMIRNOFF.

(2) we will want to at some point enable the cutoffs to be distinct in these cases?

I can't think of any cases; a cutoff being used in either periodic or non-periodic treatments should be rare and the possibility of shipping a force field using both should be vanishingly rare ... but I guess possible?

davidlmobley commented 2 years ago

Re the question directed at me: I personally don't expect a hard cutoff for nonperiodic systems. And I don't see a use case for separate cutoffs in the two cases.

I also read everything else here and am fine with where this has ended up, so I approve this EP as written in its current form. @SimonBoothroyd ?

j-wags commented 2 years ago

Thanks for the close read, @SimonBoothroyd and everyone else! I'll take another swing this today - basically

Some of the defaults that Simon suggested changing to none will require a bit more thought. I remember there's a mess of complexity when a default value has to change, but I can't verbalize it well now. I'll think this through and write it up with my edits.

j-wags commented 2 years ago

Ok, I think this is in good shape now. Thanks again @SimonBoothroyd for catching the inconsistencies.

The notable changes now (beyond just fixing inconsistencies) are:

Only cutoff="none" and switch_width="none" is allowed for Ewald methods---only finite-ranged methods should set these to a non-none value.

to

It is possible to define an Electrostatics section where no potential uses cutoff, switch_width, or solvent_dielectric. In these cases it is strongly recommended that these values be set to none to avoid ambiguity.