Add support for constant pH simulation

I am opening a new issue to start a discussion of how we can have OpenMM support constant-pH simulation, which would allow protonation states of molecules in the system (both proteins and small molecules) to change during the simulation.

I propose we adopt a modification of the Monte Carlo scheme of Mongan and Case described in Reference [1]. In this scheme, a topology containing all possible protons is created, and groups of charges on titratable groups are modified during the course of the run to reflect changes in protonation state as protons are added or removed.

For implicit solvent simulations, the scheme goes like this:

Select the number $m$ of titratable groups that will be modified with probability $P(m)$.
Select $m$ titratable groups to be modified with uniform probability.
For each titratable group to be modified, select a new protonation state with uniform probability.
Compute the initial potential energy.
Modify the protonation states to their new states by changing charges.
Compute the final potential energy.
Accept or reject with a modified Metropolis criteria that incorporates the reference pKas for the titratable groups and the simulation pH.

For explicit solvent simulations, this scheme must be modified such that Steps 4-6 are split up over $N$ steps of velocity Verlet dynamics, and the total work $W$ is accumulated instead of the potential energy difference:

Select the number $m$ of titratable groups that will be modified with probability $P(m)$.
Select $m$ titratable groups to be modified with uniform probability.
For each titratable group to be modified, select a new protonation state with uniform probability.
Run $N$ steps of velocity Verlet dynamics where we alternate between:
- updating the charges by $1/N$ the total charge change, accumulating the potential energy change as work $W$
- run one step of velocity Verlet dynamics
  1. Accept or reject with a modified Metropolis criteria that incorporates the reference pKas for the titratable groups and the simulation pH. This scheme is essentially a modification of the scheme proposed in Reference [2] using NCMC (Reference [3]). Some minor details are skipped in this overview sketch.

The implicit solvent scheme is a special case of the explicit solvent scheme where $N$ is set to $0$ steps of velocity Verlet. Therefore, both schemes can be handled by the same interface.

Some additional tools are required to set up these simulations:

A tool to create a Topology object with desired titratable amino acids converted to titratable forms of these residues, along with a list of the titratable groups and associated possible charge states and reference pKas.
A tool to calibrate the reference free energy differences for terminally-blocked versions of these titratable amino acids given the specific simulation details (implicit solvent model or PME/RF parameters, cutoff), since reference values are dependent on the precise simulation details.
A way to analyze a simulation trajectory to extract information about the protonation state history.

I had started to implement this functionality through a pure Python implementation (using Force::updateParametersInContext()), and have a very basic working version. We essentially have to decide whether this functionality should be at the pure Python level, or might be something we want to implement at one of the C++ API layers, and what degree of functionality we want at the C++ level.

My current Python implementation requires the system be set up for simulation by the AMBER tools, generating corresponding prmtop and cpin files:

# Load and build the AMBER system.
import simtk.openmm.app as app
inpcrd = app.AmberInpcrdFile(inpcrd_filename)
prmtop = app.AmberPrmtopFile(prmtop_filename)
system = prmtop.createSystem(implicitSolvent=app.OBC2, nonbondedMethod=app.NoCutoff, constraints=app.HBonds)

# Initialize Monte Carlo titration.
mc_titration = MonteCarloTitration(system, temperature, pH, prmtop, cpin_filename)

# Run constant pH dynamics.
for iteration in range(niterations):
    # Run some dynamics.
    integrator.step(nsteps)

    # Attempt protonation state changes.
    mc_titration.update(context)

One simple way we could make this efficient and minimally complex would be to create a C++ MonteCarloTitrationForce object that handles the updating of protonation states on the fly, avoiding the need for external calls to mc_titration.update(context). A Python tool could still help configure the MonteCarloTitrationForce object from AMBER prmtop files or OpenMM Topology objects, but the protonation state changes would be handled automatically during integration.

This same functionality could also be extended to handle Monte Carlo switching between tautomeric states of small molecules, though this might require allowing Lennard-Jones parameters to change as well.

References

[1] Mongan J, Case DA, and McCammon JA. Constant pH molecular dynamics in generalized Born implicit solvent. J Comput Chem 25:2038, 2004. DOI

[2] Stern HA. Molecular simulation with variable protonation states at constant pH. JCP 126:164112, 2007. DOI

[3] Nonequilibrium candidate Monte Carlo is an efficient tool for equilibrium simulation. PNAS 108:E1009, 2011. DOI

openmm / openmm

Add support for constant pH simulation #19

References