openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License

Long term sustainability of ParameterHandler #343

Open andrrizzi opened 5 years ago

andrrizzi commented 5 years ago

Separating the observation in this comment.

Goal

It would be ideal to maintain a 1-to-1 correspondence between forces and SMIRNOFF tags/ParameterHandlers. This would make it much easier, after we generate the parametrized System, to go back and modify the original parameter in the ForceField based on, for example, gradient information.

Description

With the current spec, I think we'll have problems in both directions.

  1. Many SMIRNOFF tags -> single force. Currently, <Electrostatics>, <LibraryCharges>, and <ChargeIncrement> are separate parameter handlers that all feed into the definition of a single force object controlling the Coulomb interactions.
  2. Single SMIRNOFF tag -> multiple forces. Most of our tags support a potential attribute (e.g. <vdW version="0.3" potential="Lennard-Jones-12-6" ...>). Right now we support only 1 potential function, and the vdWParameterHandler creates a LJ force object. Once we start supporting alternative potentials, the parameter handler will have to generate either different force objects (1 for each potential) or a single object that encapsulates the logic for multiple potentials and exposes different types of parameters through its API. Neither of these solutions seems ideal to me.
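
To make the two mismatches concrete, here is a minimal sketch in plain Python. The tag-to-force table is an assumption based only on the description above, and the force names (`CoulombForce`, `LJForce`, `BuckinghamForce`, `BondForce`) are hypothetical labels, not toolkit or OpenMM classes. The function simply flags every place where the mapping stops being 1-to-1.

```python
from collections import defaultdict

# Hypothetical table: SMIRNOFF tag -> force objects it contributes to.
# The force names are illustrative labels, not real toolkit/OpenMM classes.
TAG_TO_FORCES = {
    "Electrostatics": ["CoulombForce"],
    "LibraryCharges": ["CoulombForce"],
    "ChargeIncrement": ["CoulombForce"],
    # One tag, several forces once more than one `potential` is supported.
    "vdW": ["LJForce", "BuckinghamForce"],
    "Bonds": ["BondForce"],
}


def find_mismatches(tag_to_forces):
    """Return (many_to_one, one_to_many) violations of a 1-to-1 mapping."""
    force_to_tags = defaultdict(list)
    for tag, forces in tag_to_forces.items():
        for force in forces:
            force_to_tags[force].append(tag)
    # Forces that are built from more than one tag.
    many_to_one = {f: sorted(t) for f, t in force_to_tags.items() if len(t) > 1}
    # Tags that produce more than one force.
    one_to_many = {t: sorted(f) for t, f in tag_to_forces.items() if len(f) > 1}
    return many_to_one, one_to_many


many_to_one, one_to_many = find_mismatches(TAG_TO_FORCES)
print("many tags -> one force:", many_to_one)
print("one tag -> many forces:", one_to_many)
```

With this table, the Coulomb force is reported as built from three tags, and the vdW tag as producing two forces; only the bond entry is 1-to-1.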

Possible solutions

  1. Make <LibraryCharges> and <ChargeIncrement> children tags of <Electrostatics>?
  2. Remove the potential attribute and change the name of the SMIRNOFF tag to identify the specific force rather than the generic type of interaction it is modelling?
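
As a rough illustration of solution 1, the sketch below parses a hypothetical nested layout with Python's standard `xml.etree.ElementTree`. The element nesting, the child element names, and all attribute values are assumptions made for this example; none of them are taken from the SMIRNOFF spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout for solution 1. Nesting, child element names, and
# attribute values are illustrative only, not taken from the SMIRNOFF spec.
NESTED = """
<SMIRNOFF>
  <Electrostatics method="PME">
    <LibraryCharges>
      <LibraryCharge smirks="[#11+1:1]" charge1="1.0"/>
    </LibraryCharges>
    <ChargeIncrement>
      <Increment smirks="[#6:1]-[#1:2]" increment1="-0.05" increment2="0.05"/>
    </ChargeIncrement>
  </Electrostatics>
</SMIRNOFF>
"""

root = ET.fromstring(NESTED)
electrostatics = root.find("Electrostatics")

# Every charge-generating section is now reachable from the single tag
# that corresponds to the Coulomb force.
charge_sections = [child.tag for child in electrostatics]
print(charge_sections)
```

In this layout a single top-level tag owns everything that defines the Coulomb force, which restores the 1-to-1 mapping for electrostatics at the cost of deeper nesting.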

andrrizzi commented 5 years ago

Originally posted by @j-wags here in #310.

With regard to "many-to-one and one-to-many relationships between SMIRNOFF sections and Force/Potential objects", I had a good discussion with @andrrizzi this morning that included this among other topics. It's not all relevant, but my full thoughts are linked in the 0.4.0 releasenotes reference above. comment link here

tl;dr -- the relevant part is that enclosing the charge-generating sections in the Electrostatics tag makes sense at first glance. However, it leaves the loose ends of "what if people want to include multiple potentials that are based on charges?" and "then where do we put the VirtualSites section, which represents particles that have both electrostatics and vdW interactions?". I think there are some parallels between this decision and section-bundling that may make sense to consider at the same time.

jchodera commented 5 years ago

As noted in #310, we are aiming to mitigate many of these concerns by migrating to a parameterized System object model that is closer to how SMIRNOFF separates parameters than the OpenMM System object model.

I do think it would be useful to think about how best to handle parameter handlers like constraints (which require access to equilibrium bond length parameters) and charge models (which set or modify partial charges). Is there some general way we could handle this while maintaining modularity, and extending these concepts for the future (e.g. for virtual sites, polarizable sites, multipoles, GB models, etc.)?

j-wags commented 5 years ago

No conclusions here, but some more thoughts. I had a conversation that touched on this topic with @andrrizzi and @MSchauperl. The relevant parts were:

jchodera commented 5 years ago

Tagging @peastman, who may have valuable input here as well.

peastman commented 5 years ago

I've been wrestling with these sorts of issues in OpenMM for a long time. The way it currently handles them is serviceable but not ideal. I'm not sure what would be better though. Part of the problem is that force field designers do a lot of wacky things (especially with coarse-grained force fields, but not exclusively), and any assumption you make, someone will be unhappy with it.

For example, in OpenMM we have separate force objects for Coulomb (NonbondedForce) and implicit solvent (GBSAOBCForce or CustomGBForce). Both of those objects depend on atomic charges. Should the XML file be structured so you only specify the charges once? That would be convenient and avoid potential mistakes. But if you try to enforce that, someone is going to come along and complain that their new implicit solvent model requires different charges in the two places. So we have a design that lets you do it either way.
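
A minimal sketch of the "either way" idea, in plain Python: charges are specified once by default, and a force can opt in to its own values. This is not OpenMM's actual mechanism; the function `charges_for`, the force labels, and all numbers are hypothetical and exist only for illustration.

```python
def charges_for(force_name, shared_charges, overrides=None):
    """Return per-atom charges for one force.

    `shared_charges` is specified once. `overrides` optionally maps a force
    name to its own charge list, for models that need different values.
    """
    overrides = overrides or {}
    return list(overrides.get(force_name, shared_charges))


shared = [-0.8, 0.4, 0.4]  # made-up charges for a three-atom molecule

# Default: both forces read the same charges, so they cannot drift apart.
coulomb = charges_for("Coulomb", shared)
implicit = charges_for("ImplicitSolvent", shared)

# Opt-in: the implicit solvent term uses its own (made-up) charges.
custom = {"ImplicitSolvent": [-0.7, 0.35, 0.35]}
implicit_custom = charges_for("ImplicitSolvent", shared, custom)
coulomb_custom = charges_for("Coulomb", shared, custom)
```

The default path keeps the convenient, mistake-resistant behavior, while the override keeps the door open for models that need different charges in the two places.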

Another example is the treatment of dispersion and repulsion. Usually we treat these as a single interaction with the LJ form, with both parts depending on the same sigma and epsilon parameters. But they don't have to be. Does that mean you should have separate tags for dispersion and repulsion? An extreme example is the HIPPO force field, which I recently implemented in OpenMM. It has separate terms for Coulomb, dispersion, repulsion, and charge transfer. But although they're computed separately, they also share parameters. For example, the repulsion is anisotropic, and depends on the same multipole moments used for Coulomb. In the end I combined all of them into a single Force object, but it's an object that combines a bunch of conceptually distinct interactions.
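
For the plain LJ case, the split is easy to write down. The sketch below uses the standard 12-6 form, 4*epsilon*[(sigma/r)^12 - (sigma/r)^6], and computes the repulsion and dispersion parts as separate functions that share the same sigma and epsilon. The function names and the numeric values are chosen for this example only.

```python
def repulsion(r, sigma, epsilon):
    """Repulsive r^-12 part of the 12-6 Lennard-Jones potential."""
    return 4.0 * epsilon * (sigma / r) ** 12


def dispersion(r, sigma, epsilon):
    """Attractive r^-6 part of the 12-6 Lennard-Jones potential."""
    return -4.0 * epsilon * (sigma / r) ** 6


def lennard_jones(r, sigma, epsilon):
    """Full LJ energy: two conceptually distinct terms, one parameter set."""
    return repulsion(r, sigma, epsilon) + dispersion(r, sigma, epsilon)


sigma, epsilon = 0.3, 0.5  # arbitrary illustrative values
r_min = 2 ** (1 / 6) * sigma  # location of the LJ minimum

print(lennard_jones(sigma, sigma, epsilon))
print(lennard_jones(r_min, sigma, epsilon))
```

Up to floating-point rounding, the energy is zero at r = sigma and -epsilon at the minimum. The two terms could live in separate tags or force objects, but they would still have to agree on sigma and epsilon, which is the parameter-sharing problem described above.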

However, while that means it's possible, we should be careful not to adopt some sort of data structure that would only work in XML.

You can do a lot of wild and crazy things in XML. My advice: don't. :) XML can be a simple, easy to read, easy to edit by hand format. But if you start using the more advanced features, it can also become incomprehensible.