To clarify, after some discussion with @proteneer:
The object models described here are primarily to make it easy for humans to inspect and manipulate parameterized systems.
Fundamentally, parameterized systems are just multidimensional numerical arrays plus indices of assigned parameters, and any energy computations or derivatives will ultimately need to work with multidimensional numerical arrays (e.g., numpy or tensorflow arrays). These object models provide a convenient human-facing interface to those arrays, but ultimately, to make it easy to build numpy methods that can be JITed, there would need to be some way to efficiently render them to unitless floating-point arrays.
But the arrays are useful: the JIT will use these arrays, the optimizer will use these arrays, and any Bayesian samplers will use these arrays, so it's worth thinking about how to make the mapping between the object model and these arrays bidirectional and convenient.
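As a concrete illustration of that round trip (a minimal sketch with hypothetical names and keys, not a proposed API):

```python
import numpy as np
import pint

ureg = pint.UnitRegistry()

# Hypothetical unit-bearing parameters, keyed by (handler, SMIRKS, attribute).
parameters = {
    ('Bonds', '[#6X4:1]-[#6X4:2]', 'k'): 620.0 * ureg.kilojoule / (ureg.mole * ureg.nanometer**2),
    ('Bonds', '[#6X4:1]-[#6X4:2]', 'length'): 0.15 * ureg.nanometer,
}

# Flatten: fix a key order, convert to a consistent unit system, strip units.
keys = sorted(parameters)
flat = np.array([parameters[key].to_base_units().magnitude for key in keys])

# ... hand `flat` to a JITed energy function, optimizer, or sampler ...

# Unflatten: write (possibly updated) values back into the object model.
for key, value in zip(keys, flat):
    parameters[key] = value * parameters[key].to_base_units().units
```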
I think there are essentially two things here: an object model that is convenient for humans, and flat arrays that are convenient for computation.
We can decide on the API later once we can agree on what the goals are here.
It sounds like we're on the same page here.
> The need to keep track of consistent units. Move simtk.unit -> pint.
We would likely want the API to present one data view where parameters are individually accessed as unit-bearing quantities (intended for programmatic access to individual parameters when constructing systems, or for humans manipulating parameters), and another data view where parameters are exposed as flat unitless arrays for optimization or energy computation.
> We can decide if we want the System builder to only accept numpy arrays with units.
Can you clarify what you mean by "System builder" in this context?
> OpenMM's System class is unwieldy since it's written in C++. We also need to keep track of unique parameter origins (e.g., CH4 has 4 redundant sets of C-H HarmonicBond parameters). We care about generating derivatives w.r.t. unique parameters.
We'd definitely want to keep track of parameter origins, as well as the current values of applied parameters.
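For illustration only (hypothetical structures, not a proposed API), origin tracking could be as simple as storing the source parameter ID alongside each applied interaction, so that a derivative accumulates over all interactions sharing an origin:

```python
from dataclasses import dataclass

@dataclass
class AppliedBond:
    atoms: tuple       # e.g., (0, 1)
    parameter_id: str  # origin parameter in the force field, e.g., 'b83'
    k: float           # currently applied force constant
    length: float      # currently applied equilibrium length

# CH4: four C-H bonds that all point back to one unique origin parameter,
# so dE/d(parameter 'b83') sums contributions from all four interactions.
bonds = [AppliedBond((0, i), 'b83', 680.0, 0.109) for i in (1, 2, 3, 4)]
unique_origins = {bond.parameter_id for bond in bonds}  # {'b83'}
```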
Overall, I think this is a really good idea. Some scattered thoughts, in no particular order:
- Distinguishing charges that come from a `LibraryCharges` section vs. those that were generated by, for example, a `ToolkitAM1BCC` tag. That is, the library charge could be tweaked at the base-parameter level during optimization, whereas the AM1BCC one can't (or could, if we exposed some knobs to the AM1BCC calc like `n_conformers`, but exposing that sort of thing may not be useful for optimization).
- Going from `System` to `ForceField`. For example, someone might parameterize a `System`, change a torsion of interest (without changing the base parameter, so other instances of that parameter in the `System` remain unchanged), and want to extract the `ForceField` from that modified `System`. I think this would be using the SMIRNOFF spec in a very frcmod-ish way, but the capability to generate specific SMIRKS on the fly is already prototyped in Chemper.
- A `System` object will give us the chance to rethink how to export energy-affecting settings that some MD engines prefer to have in the run script (e.g., the long-range electrostatics setting living in an AMBER mdin file). Maybe we could have some pythonic way of taking an existing mdin (or template) and modifying it to contain the desired settings?
- Changing the `distance` of a `BondChargeVirtualSite` parameter in the underlying `ForceField` will affect the positions of particles in the system, and it may get tricky to resolve what this means in terms of energy gradients. This is more speculative, and could be a non-issue.
> Can you clarify what you mean by "System builder" in this context?
Formally, a mapping that takes in an OFFXML and a standard representation of a graph + 3D coordinates (e.g., SDF/Mol2) and returns a parameterized system. This was a question of when units should come into play. For example, some functional forms are unit-agnostic already (e.g., HarmonicBond, HarmonicAngle, etc.). However, by convention, electrostatics must always include a ONE_4PI_EPS that has been hardcoded to 138.935456 in standard MD units. Integrators/thermostats also carry such constants (BOLTZ, the time step, etc.). If you want to be fully consistent, you'd better start adding units to all the potentials we use.
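To make that convention concrete (a toy sketch, not project code): in standard MD units (nm, kJ/mol, elementary charge) the Coulomb prefactor enters as a bare float, so the "units" live entirely in the convention:

```python
import numpy as np

# Hardcoded Coulomb constant in standard MD units: kJ * nm / (mol * e^2).
ONE_4PI_EPS = 138.935456

def coulomb_energy(conf, charges):
    """Naive pairwise Coulomb energy (no cutoff/PME); conf in nm, charges in e."""
    diff = conf[:, None, :] - conf[None, :, :]
    dij = np.sqrt(np.sum(diff * diff, axis=-1))
    i, j = np.triu_indices(len(conf), k=1)
    qq = charges[:, None] * charges[None, :]
    return ONE_4PI_EPS * np.sum(qq[i, j] / dij[i, j])  # kJ/mol
```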
> We'd definitely want to keep track of parameter origins, as well as the current values of applied parameters.
+1. We need to decide what counts as trainable parameters though. Take GBSA, for example: there are the standard per-atom charges, radii, and scale factors, but also the scalar parameters (the OBC alpha/beta/gamma).
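As a hypothetical illustration of that bookkeeping question, per-atom parameters and global scalars could share one flat vector, with "trainable" defined as an explicit slice layout:

```python
import numpy as np

# Hypothetical GBSA-style layout: per-atom blocks followed by global scalars.
n_atoms = 5
charges = np.zeros(n_atoms)            # per-atom
radii = np.full(n_atoms, 0.15)         # per-atom (nm)
scale_factors = np.full(n_atoms, 0.8)  # per-atom
obc = np.array([1.0, 0.8, 4.85])       # global OBC alpha/beta/gamma

params = np.concatenate([charges, radii, scale_factors, obc])

# An optimizer could then declare, e.g., only the radii block trainable,
# and gradient updates would touch only this slice of the flat vector.
trainable = slice(n_atoms, 2 * n_atoms)
```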
Apologies for the wall of text.
What do you think about a design similar to this? In particular, would this cover all the use cases we have in mind or is there too much/too little here?
For now, I wouldn't implement the ability to modify the parameter of a single particle independently of the other interactions assigned to the same parameter. The `ForceField` doesn't really support "exceptions" right now, and I'm worried about keeping the two in sync. Also, it sounds like this could be a use case of less importance.
One thing I noticed is that the names of forces and parameter types may differ, as there's not always a 1-to-1 correspondence (e.g., Electrostatics, LibraryCharges, and/or ChargeIncrement all enter a single force in the end). This feels a little weird, but I haven't thought of a clean solution so far.
# Access parameters for all particles by ID.
ff_vdw_parameters = system.forcefield.parameters['VdW']
ff_vdw_parameters['[#1:1]-[#8]'].epsilon
ff_vdw_parameters['n12'].sigma
ff_vdw_parameters[2].sigma
# Or access by a tuple ID.
system.forcefield.parameters[('VdW', '[#1:1]-[#8]', 'sigma')]
# Modify parameter for all particles.
import pint
ureg = pint.UnitRegistry()
ff_vdw_parameters['n10'].sigma = 1.0 * ureg.angstrom
# Check the atoms, ID, and spring constant assigned to the 13th bond.
bond_force = system.forces['HarmonicBondForce']
bond_force[12].atoms
bond_force[12].parameter.id
bond_force[12].parameter.k
# Access charge information of a single particle.
force_cls = system.forcefield.parameters['Electrostatics'].get_force_cls() # E.g. PMEForce
# LibraryChargeType
system.forces[force_cls][2].parameter.charge # Modifying this changes system.forcefield.
# ChargeIncrementType
system.forces[force_cls][120].parameter.charge # Modifying this changes the charge_increment in system.forcefield.
system.forces['PMEForce'][120].parameter.charge_qm # Read-only
system.forces['PMEForce'][120].parameter.charge_increment
# Check all the force field terms involving a specific particle.
for parameter_type, parameters in system.particles[3].parameters:
    print(parameter_type, parameters)  # For example "'VdW' [VdWParameter(sigma, epsilon), ...]"
I like the idea of a potential energy function factory design, but I'm worried it's not going to cover all the ways we'll want to extend and use it. For example, I believe both JAX and Dask require (or are faster with) their own implementation of a numpy array. Also, it will probably force us to work downstream with a pure numpy representation of a `System`, which removes some of the advantages of this new class. So I'd focus on making it very easy to implement backends that compute energies/gradients (which essentially will be these factories), while keeping in mind that we may want to implement the factory design alongside it if we want.
I think including positions and box vectors in `System` would make it easier to evaluate potentials and export to other kinds of files, but I'm happy to hear your opinions about it.
# Compute potential energy of the system.
system.positions, system.box_vectors = positions, box_vectors
system.potential_energy
# Or for custom positions and a different backend (using a KernelBackendRegistry).
system.potential_energy(positions, box_vectors, kernel_backend='jax')
# More backend customization to control how each force is computed and distributed.
backend = MixedKernelBackend(
    force_backends={
        LJForce: JAXKernelBackend(),
        'PMEForce': OpenMMKernelBackend(),
        'HarmonicBondForce': DaskKernelBackend()
    },
    default=NumpyKernelBackend(),  # Any other force.
)
system.potential_energy(kernel_backend=backend)
In many instances, I had to work with a subset of a `System` (e.g., Monte Carlo, analyzing positions and recomputing energies for a few atoms and forces), so I'm thinking that a smart and robust `SystemView` that evaluates and creates copies of the internal data lazily could be very helpful.
# Create a slice of a system as a SystemView and compute the potential of the subsystem.
# This creates a "detached" view, meaning that a change in one object doesn't affect the other.
system_view = system.slice(force=[LJForce, 'PMEForce'], particle=range(20))
system_view.potential_energy(kernel_backend=backend)
# Or pass the slice arguments directly to the potential energy method as a shortcut for the same syntax?
system.potential_energy(**slice_kwargs)
# Create a slice of a system involving particles assigned to specific parameters.
# This creates a traditional view that allows manipulating the bigger system with the smaller one.
system_view = system.slice_view(parameters={
    ('VdW', 'n12'),
    ('Bonds', '[#6X4:1]-[#6X4:2]'),
})
# It can be detached later.
system_view.detach_view()
I'm not 100% sure about the following syntax to compute gradients. It's very intuitive to me, and I think it would in the end be easy to implement with a JAX backend, but if there are catches I'm overlooking we can switch to something like system.get_parameter_gradient/hessian().
# Compute the gradient with respect to all the parameters (in the view).
# Essentially potential_energy_function() returns a callable Potential
# object that can be called and that exposes a gradient() method.
gradient_func = system_view.potential_energy_function().gradient(parameters=True)
# If the system encapsulates positions and box_vectors, we can just do:
gradient_vector = gradient_func()
gradient_vector = gradient_func(parameters=system_view.parameters)
# Or more generally
gradient_vector = gradient_func(positions[selection_of_particle_indices], box_vectors,
                                system_view.parameters)
# Access the gradient by index, parameter, or parameter ID.
gradient_vector[15]
gradient_vector[('VdW', 'n13', 'sigma')]
# Update forcefield.
system.forcefield[gradient_vector.parameters_ids] = system_view.parameters - 0.02*gradient_vector
# Compute forces.
gradient_vector = system_view.potential_energy_function().gradient(positions=True)()
# Or compute the Hessian for a subset of parameters.
hessian_func = system_view.potential_energy_function().gradient(parameters=True).gradient(parameters=True)
@andrrizzi's comment that he "wouldn't implement the ability to modify the parameter of a single particle independently of the other interactions assigned to the same parameter" is basically the crux of the problem. For some context, no MD engine out there has a way to keep track of the origin of the forcefield parameters, and designing good classes around this problem is basically uncharted territory at this point.
After having prototyped in OpenMM/timemachine for a good 6 months on just this exact problem, I've reached the conclusion that the system construction semantics need to be completely separated from the potential energy implementation semantics. There are several practical considerations that make it important to do this carefully.
With the timemachine and all of its forcefield fitting routines, I've settled on the following function signature (for both JAX and custom-ops backends):
def potential(batched_conf, params, box):
    ...
    return energy
Concrete function implementations meet the required signature using either functors or functools.partial; the following is copied and pasted from the timemachine code:
# (np here is jax.numpy; distance() is timemachine's periodic distance helper)
def harmonic_bond(conf, params, box, bond_idxs, param_idxs):
    ci = conf[bond_idxs[:, 0]]
    cj = conf[bond_idxs[:, 1]]
    dij = distance(ci, cj, box)
    kbs = params[param_idxs[:, 0]]
    r0s = params[param_idxs[:, 1]]
    energy = np.sum(kbs/2 * np.power(dij - r0s, 2.0))
    return energy
The gradients are implemented natively using jax:
# bind the forcefield parameters via functools.partial
harmonic_bond = functools.partial(harmonic_bond, bond_idxs=bond_idxs, param_idxs=param_idxs)
dE_dx = jax.grad(harmonic_bond, argnums=(0,))
The gradients of the system:
def total_potential(conf, params):
    # they share a single params vector
    total_e = harmonic_bond(conf, params) + harmonic_angle(conf, params)
    return total_e

total_dE_dx = jax.grad(total_potential, argnums=(0,))
total_dE_dp = jax.grad(total_potential, argnums=(1,))
And construction of higher-order terms follows naturally. But the key here is that we must treat the parameters as an opaque black box that we can then differentiate with respect to. It's currently implemented as an opaque 1-dimensional vector of P total parameters that gets indexed into via the param_idxs of each potential energy class.
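Continuing the total_potential example above (a sketch of the "higher-order terms" point, not timemachine code):

```python
import jax

# Mixed second derivative d2E/dp dx: differentiate the position gradient
# with respect to the parameters (useful, e.g., for propagating dx/dp).
total_d2E_dpdx = jax.jacfwd(jax.grad(total_potential, argnums=0), argnums=1)

# Hessian of the energy with respect to the parameter vector itself.
total_d2E_dp2 = jax.hessian(total_potential, argnums=1)
```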
Some quick comments for now; more later:
> the name of forces and parameter types might be different as there's not always a 1-to-1 correspondence (e.g., Electrostatics, LibraryCharges, and/or ChargeIncrement all enter a single force in the end).
This may be something we want to fix in a future update. @j-wags : Thoughts?
> I think including positions and box vectors in System would make it easier to evaluate potentials and export to other kinds of files, but I'm happy to hear your opinions about it.
The original concept of `System` in OpenMM was to separate the dynamical variables (positions, box vectors) from the description of how to compute the energies given those dynamical variables. I think it might still be useful to keep this separation, and instead have a single object represent the dynamical information (positions and box vectors).
> I like the idea of a potential energy function factory design, but I'm worried it's not going to cover all the ways we'll want to extend and use it. For example, I believe both JAX and Dask require (or are faster with) their own implementation of a numpy array. Also, it will probably force us to work downstream with a pure numpy representation of a System, which removes some of the advantages of this new class. So I'd focus on making it very easy to implement backends that compute energies/gradients (which essentially will be these factories), while keeping in mind that we may want to implement the factory design alongside it if we want.
Any concerns about which numpy implementation should be used could be easily addressed by one of the following mechanisms:

- An optional argument to `system.create_potential_energy()` could specify whether it should be rendered for, say, `jax`, `numpy`, or `TensorFlow`. That would cover all of our major use cases.
- We could specify a `dict` of imports for `numpy` and friends to make it more extensible. I wonder if it's possible to rewrite a function line-by-line to change the origin of imports using introspection tools.

Mainly, we would want the implementation of potentials to keep in step with the parameter handlers, which would require us to update the code in two places if we were to add a completely new parameter type. We might think about a design where the parameter handler would also include its own energy implementations (and, in the future, possibly also its own information on how to translate these into OpenMM and other packages), allowing us to implement new potential terms in a totally modular fashion.
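A sketch of that modular design (hypothetical names; minimum-image handling omitted): each parameter handler would own both its parameters and a reference energy implementation, and the total energy is just the sum over handlers:

```python
import jax.numpy as jnp

class HarmonicBondHandler:
    """Hypothetical handler bundling parameter indices with an energy law."""

    def __init__(self, bond_idxs, param_idxs):
        self.bond_idxs = bond_idxs    # (n_bonds, 2) atom indices
        self.param_idxs = param_idxs  # (n_bonds, 2) indices into flat params

    def energy(self, conf, params, box):
        ci = conf[self.bond_idxs[:, 0]]
        cj = conf[self.bond_idxs[:, 1]]
        dij = jnp.linalg.norm(ci - cj, axis=-1)  # no periodic wrapping here
        kbs = params[self.param_idxs[:, 0]]
        r0s = params[self.param_idxs[:, 1]]
        return jnp.sum(kbs / 2 * (dij - r0s) ** 2)

# Adding a brand-new potential type then means adding one handler class;
# the two places that previously had to change collapse into one.
def total_energy(handlers, conf, params, box):
    return sum(h.energy(conf, params, box) for h in handlers)
```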
I also agree with pretty much everything @proteneer notes above, and want to emphasize that a primary goal here is to allow us ultimately to construct functions that expose a 1D vector of unique mutable continuous parameters, which we can use to compute energies and parameter gradients of the energy for the convenience of optimizers and samplers. We do also want to make it useful for manual inspection and interpretation, however. Because the same information is required for both purposes (just different views of it), I think it does make sense either to bundle them together into this class or to use objects of this class to construct the functions that map 1D parameter vectors to energies.
Thank you both. I agree with essentially everything, and I made a few modifications to the last example in my post above that should make things less confusing.
To clarify some of the choices, here are the two extra constraints for the design that I'm trying to satisfy: 1) exposing a common API for differentiating the potential regardless of the backend, and 2) making it easy to inspect gradients and map them back onto the `ForceField` class.

For 1), what I'm trying to do above is essentially to return a callable `Potential` object, instead of a static function, that exposes a common API to be differentiated. Essentially, the sole purpose of the `Backend` classes would be the construction of this object.
For 2), working with pure numpy vectors is hard. You need to somehow know the exact order of the parameters in the 1D vector to inspect its gradient and update the `ForceField` accordingly. So I'm thinking of having our `Potential.gradient()/fwdgrad()/revgrad()` functions return a numpy array wrapper that makes it possible to address the array elements by parameter labels.
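A minimal sketch of such a wrapper (hypothetical; a production version might subclass numpy's ndarray instead):

```python
import numpy as np

class LabeledVector:
    """Flat gradient vector addressable by flat index or parameter label."""

    def __init__(self, values, labels):
        self.values = np.asarray(values)
        # e.g., ('VdW', 'n13', 'sigma') -> position in the flat vector
        self._index = {label: i for i, label in enumerate(labels)}

    def __getitem__(self, key):
        if isinstance(key, tuple):  # parameter-label access
            key = self._index[key]
        return self.values[key]

labels = [('VdW', 'n13', 'sigma'), ('VdW', 'n13', 'epsilon')]
grad = LabeledVector([0.1, -0.3], labels)
assert grad[('VdW', 'n13', 'epsilon')] == grad[1]
```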
@proteneer
> is basically the crux of the problem

Do you mean that you think this is fine (i.e., not implementing it), or that this is an interesting use case and we should find a way to support exceptions that are not assigned to a SMIRKS pattern?
> the system construction semantics need to be completely separated from the potential energy implementation semantics
Absolutely agree. I was thinking the latter would be entirely encapsulated in the `Backend` classes. I actually went through the (awesome) timemachine code and examples in the readme, and tried to imagine how a `Backend`/`Potential` object could encapsulate the lines of code you wrote above.
@jchodera
> This may be something we want to fix in a future update. @j-wags : Thoughts?
The first thing that comes to mind is that we might want to make the charge model tags children of the `Electrostatics` tag, which will essentially give 3 levels of indentation instead of 2.
But I think the problem is not limited to multiple tags defining one force. For example, in the future the `vdW` tag could support multiple potential forms. We'd then have one very big `vdW` parameter handler (and force) class supporting multiple potentials and exposing different kinds of parameters. We should probably start a separate discussion about how we should handle this, however. ~~Will open another issue later~~ (done, see #343).
> I think it might still be useful to keep this separation
Sounds great to me, and we can always extend `System` with this capability later if needed, although I think it will make things harder when dealing with a slice of the `System`. I think we'll also need to consider how we want to represent periodicity; right now, I believe we're "hacking" it through the presence/absence of box vectors in the `Topology` object.
> An optional argument to `system.create_potential_energy()` could specify whether it should be rendered for, say, `jax`, `numpy`, or `TensorFlow`.
I agree. That is essentially my design, only `get_potential_energy(backend='jax')` would return a differentiable object instead of returning a function. I see now that the name is confusing and we should have two different methods: one returning only a `Quantity`, and the other returning the potential function (whether it's a python function or an object). Less surprising. I've changed my example above to reflect this.
> We could specify a dict of imports for numpy and friends to make it more extensible. I wonder if it's possible to rewrite a function line-by-line to change the origin of imports using introspection tools
If I understand this correctly, this might be trickier. For example, JAX wants you to use functional numpy programming while Dask uses the array's methods, so in general I don't think force objects can give us one numpy implementation that we can reuse for accelerating/differentiating the code with different libraries. I think the backends should be able to import what they need internally anyway.
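A toy example of that style mismatch (not project code): the same reduction written idiomatically for each library:

```python
import numpy as np
import jax.numpy as jnp
import dask.array as da

x = np.array([1.0, 2.0, 3.0])

# JAX: functional calls on immutable arrays, traceable/JITable.
jax_result = jnp.sum(jnp.asarray(x) ** 2)

# Dask: method chaining on lazily evaluated, chunked arrays.
dask_result = (da.from_array(x, chunks=2) ** 2).sum().compute()
```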
I've been quiet in this thread, but I wanted to chime in and say that I'm mostly following along here and all of this seems reasonable -- I think you all have a better understanding of the constraints and requirements here.
With regard to "many-to-one and one-to-many relationships between SMIRNOFF sections and `Force`/`Potential` objects", I had a good discussion with @andrrizzi this morning that included this among other topics. It's not all relevant, but my full thoughts are linked in the 0.4.0 release notes reference above (comment link here).
tl;dr -- the relevant part is that enclosing the charge-generating sections in the `Electrostatics` tag makes sense at first glance. However, it leaves the loose ends of "what if people want to include multiple potentials that are based on charges?" and "then where do we put the VirtualSites section, which represents particles that have both electrostatics and vdW interactions?". I think there are some parallels between this decision and section-bundling that make sense to consider at the same time.
Thanks for your thoughts, Jeff! I've cross-posted that in #343, which I split off from this issue to discuss that question.
@mattwthompson I think this one should get closed; I think our plans in this direction are described by the Interchange effort and if we do anything else it wouldn't be for far longer.
Thanks!
I've been discussing (with @andrrizzi @SimonBoothroyd @j-wags @proteneer) the potential for solving a number of technical debt issues we have accumulated with a new Python container class to represent a parameterized system.
To recap, we have a few major accumulated technical debt issues:
- `ForceField` is mainly structured around constructing an OpenMM `System` object, which is a Swig-wrapped C++ object that violates PEP8 naming principles with a highly un-pythonic API. The overhead of these conversions can also be significant, and they are time-consuming to serialize via XML.
- We use `simtk.unit`, rather than a more standard library like `pint`, for handling units; this was discussed here. In addition, `simtk.unit` does not easily support ensuring dimensionality is correct.
- While OpenMM provides `CustomForce` classes which can encode many functional forms for existing parameter types or new parameter types, it becomes very difficult to decode these custom forces to convert to representations suitable for other simulation packages without a lot of specialized logic to try to detect what we encoded. OpenMM also generally mixes electrostatics and vdW interactions in a single `NonbondedForce`, which creates further complexity for decoding.
- While OpenMM `System` objects are very flexible, they are inextensible without writing C++ plugins.
- The OpenMM `System` object has all parameters unrolled into individual interactions, which makes it difficult to identify how origin parameters affect multiple interactions.
- Our current labeling facility, `ForceField.label_molecules()`, returns a dict of lists of tuples of lists, which should very much be replaced with a sensible object model.

Considering all of these limitations, I propose we create a new object model for parameterized systems that aims to solve as many of these problems as is conceivable via a good, extensible design, and amortize the cost of its implementation over the various subprojects that need different capabilities.
Our design should:

- use a unit framework like `pint` that facilitates dimension-checking
- retain a reference to the `Topology` object from which the system was created, to make it easier to identify interactions between desired atoms

I find it helpful to kick around ideas of what using the class would look like. Here are some very rough ideas for inspiration.
Suppose first that it would be useful to create something like a `ParameterSet` object that would replace the current awkward `ForceField` syntax with something more natural.
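The original snippets are not reproduced here, so the following is a purely hypothetical sketch of what "more natural" access could look like (none of these names are the actual proposal):

```python
# Hypothetical: a flat, dictionary-like view over the force field parameters.
params = ParameterSet.from_forcefield(forcefield)

# Access by handler name and SMIRKS pattern.
sigma = params['vdW']['[#1:1]-[#8]'].sigma
```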
This would enable manipulations like
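(again hypothetical, just to suggest the flavor of bulk manipulation):

```python
import pint

ureg = pint.UnitRegistry()

# Scale every bond force constant by 5%, units carried along automatically.
for parameter in params['Bonds']:
    parameter.k = parameter.k * 1.05

# Set a single vdW parameter with an explicit unit-bearing quantity.
params['vdW']['[#1:1]-[#8]'].sigma = 0.3 * ureg.nanometer
```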
None of this is intended to be a real suggestion for a final API, but just to give the flavor of what we could potentially implement in a manner that would let us easily solve many of the problems above.