openforcefield / openff-interchange

A project (and object) for storing, manipulating, and converting molecular mechanics data.
https://docs.openforcefield.org/projects/interchange
MIT License
71 stars 23 forks source link

Enforcing order of potential handlers #35

Closed mattwthompson closed 1 year ago

mattwthompson commented 4 years ago

The current pattern in the toolkit is to populate forces by iterating through ParameterHandler objects and calling the create_force method of each. For exporting to OpenMM, this requires some custom code that, among other things, enforces order so that some handlers are only ever called after other ones. I believe, for example, ChargeIncrementModelHandler can only ever apply charge increment after AM1-BCC charges have been applied by ToolkitAM1BCC. This is done by tagging each class with a ._DEPENDENCY attribute that specifies what must come before each, resolving that graph, and using it to specify the other they'll be iterated through.

Maintaining this behavior is probably unavoidable, so the question if there's a better way to implement it or not.

j-wags commented 4 years ago

Do PotentialHandlers need order? I get the sense that energy computation should be order-independent. A System will only need to know about order in the case of reparameterization, in which case the ForceField will contain the ordering logic.

mattwthompson commented 4 years ago

The energy evaluation part should not, but the parametrization part should. If that logic stays in the toolkit, then it won't think much changed between the OpenMM System present and OpenFF System future. This might come down to the question of ForceField.create_openff_system vs. System.parametrize(topology, forcefield, ...). A downside of the current monkey-patching is that the responsibility is blurred:

https://github.com/openforcefield/openff-system/blob/f3a48953d50d1b806c7c10f0142cf6f92f70a0be/openff/system/stubs.py#L27-L51 Now, order doesn't need to be enforced, because the Electrostatics tagged is hacked together to only store the partial charge. But you can see that if it called the other handlers, order would be more relevant, as is in the toolkit now.

mattwthompson commented 4 years ago

Another case: constraints usually don't specify the distance they're to be constrained to.

     <Constraints version="0.3">
         <!-- constrain all bonds to hydrogen to their equilibrium bond length -->
         <Constraint smirks="[#1:1]-[*:2]" id="c1"></Constraint>
     </Constraints>

The equilibrium bond length is supposed to be derived from bonds, meaning a constrained bond (i.e. any H-X bond here) will need to be gather information from two separate handlers. The toolkit handles this completely, as far as I can tell, but the details are baked into the create_force functions (alongside OpenMM details), not only the find_matches part, which is where the typing step happens. There are several details to consider, which I won't repeat except for copy+pasting

Here's the section of the SMIRNOFF spec:


<Constraints>

Bond length or angle constraints can be specified through a <Constraints> block, which can constrain bonds to their equilibrium lengths or specify an interatomic constraint distance. Two atoms must be tagged in the smirks attribute of each <Constraint> record.

To constrain the separation between two atoms to their equilibrium bond length, it is critical that a <Bonds> record be specified for those atoms:

<Constraints version="0.3" >
  <!-- constrain all bonds to hydrogen to their equilibrium bond length -->
  <Constraint smirks="[#1:1]-[*:2]" />
</Constraints>

Note that the two atoms must be bonded in the specified Topology for the equilibrium bond length to be used.

To specify the constraint distance, or constrain two atoms that are not directly bonded (such as the hydrogens in rigid water models), specify the distance attribute (and optional distance_unit attribute for the <Constraints> tag):

<Constraints version="0.3">
  <!-- constrain water O-H bond to equilibrium bond length (overrides earlier constraint) -->
  <Constraint smirks="[#1:1]-[#8X2H2:2]-[#1]" distance="0.9572*angstrom"/>
  <!-- constrain water H...H, calculating equilibrium length from H-O-H equilibrium angle and H-O equilibrium bond lengths -->
  <Constraint smirks="[#1:1]-[#8X2H2]-[#1:2]" distance="1.8532*angstrom"/>
</Constraints>

Typical molecular simulation practice is to constrain all bonds to hydrogen to their equilibrium bond lengths and enforce rigid TIP3P geometry on water molecules:

<Constraints version="0.3">
  <!-- constrain all bonds to hydrogen to their equilibrium bond length -->
  <Constraint smirks="[#1:1]-[*:2]" />
  <!-- TIP3P rigid water -->
  <Constraint smirks="[#1:1]-[#8X2H2:2]-[#1]" distance="0.9572*angstrom"/>
  <Constraint smirks="[#1:1]-[#8X2H2]-[#1:2]" distance="1.8532*angstrom"/>
</Constraints>
Constraint section tag version Required tag attributes and default values Required parameter attributes Optional parameter attributes
0.3 smirks distance
mattwthompson commented 1 year ago

This is no longer needed after refactoring from Chain of Responsibility to something more like a Builder pattern.