openmm / openmm-ml

High level API for using machine learning models in OpenMM simulations
Other
80 stars 25 forks source link

`createMixedSystem` enhancement for switching between MM and `TorchForce` with `GlobalParameter`s #12

Closed dominicrufa closed 6 months ago

dominicrufa commented 3 years ago

I think it might be worthwhile to add an enhancement to the createMixedSystem functionality where instead of modelling a region (set of atoms) with a TorchForce and the rest of the system with conventional MM forcefields, we might want to have GlobalParameters that turn off the MM-ff of a region of interest and turn on the TorchForce with some GlobalParameters. I think it would also be very valuable to have a temperature-like scale factor that 'softens' the interactions between the MM/TorchForce and MM-only region in a REST-like capacity.

I have a (complicated) working implementation of this (note that this functionality only modifies the existing MM-forces. the user needs to specify a list of atoms constituting a region for which the MM-energies should be scaled down. The TorchForce generator lives here ). The most complicated part is defining the MM-forces (namely softcoring the intra-region nb interactions so as to avoid singularities when the MM-energies are switched off).

Would it be possible that we might integrate the linked code (or a variation of it) into openmm-ml and add some pytests? I'd be happy to help/troubleshoot.

jchodera commented 3 years ago

@peastman : Just to be clear here, the simplest broadly applicable use case just needs two parameters, and there is no need for softcore for common uses:

Here's a rough sketch: image

peastman commented 2 years ago

I think the lambda parameter would be fairly easy to implement. We could do it like this:

I don't understand what the alpha_max parameter does. Is it just to reinterpret the scale of lambda? If so, I don't understand what it's useful for. Or does it affect some interactions differently than others?

jchodera commented 2 years ago

Tagging @dominicrufa here for further input.

The alpha_max parameter allows "local heating" of the ML subsystem at intermediate lambda values. This is often done in alchemical free energy calculations via a method called "REST" (replica exchange with solute tempering) to improve sampling, because it reduces barriers to reorganization at intermediate lambda values.

peastman commented 2 years ago

For the bonded interactions, that would be easy to do by applying a scale factor in the CustomCVForce. For the nonbonded interactions within the ML region it's more complicated. For each exception, you need to calculate charge and epsilon such that the total interaction comes out exactly right given lambda and alpha.

peastman commented 2 years ago

Alternatively you could just keep all the exceptions as 0. Then add a CustomBondForce inside the CustomCVForce to compute their energies.

dominicrufa commented 2 years ago

I handle the MM region with this; the nonbonded regions withing ML region, i do what @peastman suggested: https://github.com/choderalab/qmlify/blob/master/qmlify/openmm_torch/force_hybridization.py. after this, you just have to add the TorchForce (with a global parameter attached to it)

peastman commented 2 years ago

How about this. We can add a boolean flag to createMixedSystem() that tells it to put everything inside a CustomCVForce as described above. It will also add a single lambda parameter that interpolates between MM and ML forces. It won't try to implement any particular enhanced sampling methods, but with everything separated out like that, you can easily change the energy expression of the CustomCVForce to depend on additional parameters in whatever way you want.

jchodera commented 2 years ago

That sounds like a great solution to me!

By "everything", do you mean that it would put only the MM region being removed in the CustomCVForce? If the whole system, how would this work for replacing only part of the system with ML?

Is there any reason not to use the CustomCVForce solution by default? It seems like it wouldn't slow things down and would provide a lot of flexibility that can be illustrated in tutorials.

@dominicrufa: Any concerns with this approach?

peastman commented 2 years ago

By "everything", do you mean that it would put only the MM region being removed in the CustomCVForce?

Correct. For example, there would be a HarmonicBondForce directly contained in the System, with all bonds outside the ML region. And there would be a second HarmonicBondForce inside the CustomCVForce, with the bonds inside the ML region.

Is there any reason not to use the CustomCVForce solution by default?

CustomCVForce does add overhead. Whether that's significant depends on your potential. Compared to typical MM force fields, it's significant. Compared to most ML potentials, it probably isn't. It also adds complexity.

dominicrufa commented 2 years ago

it's not obvious to me how intra-MM nonbonded interactions will behave if you allow for a simple linear scaling. ive seen nans happen with ani2x using that approach and aggressive timesteps, which is why i wrote in softcores.

peastman commented 2 years ago

Any idea what causes that? The exact shape of the short range repulsive potential is different for the two. But still, any conformation that is very high energy in one should also be very high energy in the other.

jchodera commented 2 years ago

it's not obvious to me how intra-MM nonbonded interactions will behave if you allow for a simple linear scaling. ive seen nans happen with ani2x using that approach and aggressive timesteps, which is why i wrote in softcores.

What are "aggressive" timesteps here?

Presumably, this feature would initially require the ML subsystem uses no constraints in the MM parameters (which may require some extensions to ForceField to specify no constraints should be applied to a list of atom indices), and that we are confined to 1-2 fs timesteps.

ive seen nans happen with ani2x using that approach and aggressive timesteps, which is why i wrote in softcores.

Would be great to know which force term was responsible. It's hard to imagine how LJ could be responsible for this.

peastman commented 2 years ago

By default, createMixedSystem() removes all constraints in the ML region. You can override that with removeConstraints=False. If you plan to interpolate between the two, it's probably best to just set constraints=None when creating the MM system. Since you'll already be simulating part of the system with the MM force field and no constraints, there's not likely to be a benefit to including constraints elsewhere.

dominicrufa commented 2 years ago

What are "aggressive" timesteps here?

4fs

Would be great to know which force term was responsible. It's hard to imagine how LJ could be responsible for this.

I'll have to dig into this more to figure out what the problematic force is.

jchodera commented 2 years ago

From my conversations with the ANI folks and @raimis @giadefa, we need to stick to 1-2 fs timesteps, since ANI is not stable at 4 fs timesteps. I think this issue arises from ANI, and not the MM system.

We've discussed some ways we could reach 4 fs outer timesteps with multiple timestep methods, but we would essentially require a cheaper surrogate potential that would be used for inner timesteps.

peastman commented 2 years ago

The changes described above are implemented in #24. Is it ok to close this now?

jchodera commented 2 years ago

@dominicrufa : Can you see if this fits your needs?

dominicrufa commented 2 years ago

@jchodera: this issue is blocking me