openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
301 stars 88 forks source link

Decouple partial charge assignment from creation of OpenMM systems #619

Closed mattwthompson closed 2 years ago

mattwthompson commented 4 years ago

Is your feature request related to a problem? Please describe. Currently, the only reliable way to inspect the partial charges that are applied by a SMIRNOFF force field is to ship a topology off to OpenMM (forcefield.create_openmm_system(topology)) and query that object. (If this is not strictly true now, it will be true in the coming days as #471 is merged.)

Quoting @j-wags:

After merging Dotson's WBO PR (#582), create_openmm_system will have a new return_topology kwarg. This kwarg should be considered experimental. The reference molecules in that returned Topology will have charges on them, but currently those will reflect the charges from the underlying charge method BEFORE the chargeincrements are applied.

Describe the solution you'd like Ideally, there would be a function or two that takes in ForceField and Topology and populates the latter with partial charges that precisely match what will actually be in the simulation downstream.

Selfishly, this would make working with electrostatics in the OpenFF System prototypes easier. I think that the black box of:

Topology/ForceField --> ⬛ --> that same Topology but with charges

is functionality that should always live here. Right now it's buried in the OpenMM-adjecent functions; this will need to be stripped out eventually, and disentangling these steps may avoid code duplication, which I'd argue here is more important than just style; we'd want similar functions to use common code as much as possible so that their outputs do not diverge. Right now, it's fairly simple to grab things like vdW and valence terms from calling various ParameterHandler methods, whereas partial charges may require multiple handlers (2 right now, I think?) and doing so with care.

I recognize that the proposed changes here are non-trivial and I can absolutely use temporary workarounds for my own needs. 😄

Describe alternatives you've considered A short-term solution would be to fully populate the OpenMM system in order to get the partial charges, although my main point here is that we'd want to be able to just get the partial charges without needing to create a fully OpenMM system.

Additional context There is a lot of complexity in assigning partial charges (there are multiple steps involved in determining the partial charges that a force field will apply at the end) and we should make sure that if we expose a public API point, only the final result is exposed to the user by default. Intermediate steps (i.e. after AM1-BCC - or its future replacement! - but before charge increments) should be also probably accessible, but a little hidden, i.e. non-default).

mattwthompson commented 2 years ago

There are now (at least) two ways to get something like this behavior:

  1. ForceField.get_partial_charges does everything and tells you what the charges are. It does create an OpenMM system under the hood, but it still returns the information we want.
  2. Interchange solves the problem of needing an object model that sits between force fields + molecules and OpenMM objects:

In [1]: from openff.toolkit import Molecule, ForceField

In [2]: from openff.interchange import Interchange

In [3]: sage = ForceField("openff-2.0.0.offxml")

In [4]: ethanol = Molecule.from_smiles("CCO")

In [5]: Interchange.from_smirnoff(sage, [ethanol])['Electrostatics'].charges
Out[5]:
{TopologyKey with atom indices (0,): -0.09709999834497769 <Unit('elementary_charge')>,
 TopologyKey with atom indices (1,): 0.13142999882499376 <Unit('elementary_charge')>,
 TopologyKey with atom indices (2,): -0.6013399971028169 <Unit('elementary_charge')>,
 TopologyKey with atom indices (3,): 0.0447599987188975 <Unit('elementary_charge')>,
 TopologyKey with atom indices (4,): 0.0447599987188975 <Unit('elementary_charge')>,
 TopologyKey with atom indices (5,): 0.0447599987188975 <Unit('elementary_charge')>,
 TopologyKey with atom indices (6,): 0.017319998393456142 <Unit('elementary_charge')>,
 TopologyKey with atom indices (7,): 0.017319998393456142 <Unit('elementary_charge')>,
 TopologyKey with atom indices (8,): 0.39809000367919606 <Unit('elementary_charge')>}