Clarification of water box simulation setup

varun-go commented 9 months ago

I have modified the test simulation of alanine dipeptide for a box of TIP3P water using the ANI1ccx model. I would like to clarify if my simulation script is correct. One concern I have is if the periodic boundary conditions are being taken into account correctly. Any clarifications or suggestions would be helpful. Thank you!

import openmmtools
import os
from openmm.app import *
from openmm import *
from openmm.unit import *
import sys

# Get the water box system
water = openmmtools.testsystems.WaterBox(model='spce')

# Remove MM constraints
while water.system.getNumConstraints() > 0:
  water.system.removeConstraint(0)

# Remove MM forces
while water.system.getNumForces() > 0:
  water.system.removeForce(0)

# The system should not contain any additional force and constrains
assert water.system.getNumConstraints() == 0
assert water.system.getNumForces() == 0

# Get the list of atomic numbers
atomic_numbers = [atom.element.atomic_number for atom in water.topology.atoms()]

import torch as pt
from torchani.models import ANI1ccx
from NNPOps import OptimizedTorchANI
device = pt.device('cuda' if pt.cuda.is_available() else 'cpu')

class NNP(pt.nn.Module):

  def __init__(self, atomic_numbers):

    super().__init__()

    # Store the atomic numbers
    self.atomic_numbers = pt.tensor(atomic_numbers).unsqueeze(0)

    # Create an ANI-1ccx model
    self.model = ANI1ccx(periodic_table_index=True)

    # # Accelerate the model
    self.model = OptimizedTorchANI(self.model, self.atomic_numbers)

  def forward(self, positions, boxvectors):

    # tensor of booleans for periodic boundary conditions
    pbc = pt.tensor(([True,True,True]))

    # Prepare the positions
    positions = positions.unsqueeze(0).float() * 10 # nm --> Å
    boxvectors = boxvectors * 10 # nm --> Å

    # Run ANI
    result = self.model((self.atomic_numbers, positions),
                        cell=boxvectors, 
                        pbc=pbc.to(device='cuda'))

    # Get the potential energy
    energy = result.energies[0] * 2625.5 # Hartree --> kJ/mol

    return energy

# Create an instance of the model
nnp = NNP(atomic_numbers)

from openmmtorch import TorchForce

print('Loading model!')
# Save the NNP to a file and load it with OpenMM-Torch
pt.jit.script(nnp).save('water_model_ani1ccx_nvt_pbc.pt')
force = TorchForce('water_model_ani1ccx_nvt_pbc.pt')
force.setUsesPeriodicBoundaryConditions(True)  # by default, PBC is off
print('Loaded model!')

# Add the NNP to the system
water.system.addForce(force)
assert water.system.getNumForces() == 1

import sys
from openmm import LangevinMiddleIntegrator
from openmm.app import (Simulation, StateDataReporter, 
                        PDBReporter, PDBFile, 
                        DCDReporter, CheckpointReporter)
from openmm.unit import kelvin, picosecond, femtosecond

# Create an integrator with a time step of 1 fs
temperature = 300 * kelvin
frictionCoeff = 1 / picosecond
timeStep = 0.1 * femtosecond
integrator = LangevinMiddleIntegrator(temperature, frictionCoeff, timeStep)

# Create a simulation and set the initial positions and velocities
platform = Platform.getPlatformByName('CUDA')
simulation = Simulation(water.topology, water.system, integrator, platform)
simulation.context.setPositions(water.positions)

# write out the initial configuration to a pdb file
positions = simulation.context.getState(getPositions=True).getPositions()
PDBFile.writeFile(simulation.topology, positions, open('box_water_ani1ccx_nvt_pbc.pdb', 'w'))

# simulation.context.setVelocitiesToTemperature(temperature) # This does not work (https://github.com/openmm/openmm-torch/issues/61)

# energy minimization
print('starting energy minimization...')
simulation.minimizeEnergy()
print('finished energy minimization!')

RaulPPelaez commented 9 months ago

Are you seeing some unexpected behavior? Your code looks ok at first glance. You are correctly turning on PBC.

peastman commented 9 months ago

You might look at OpenMM-ML. It provides a much easier way to set up the system.

The calls to .cuda() shouldn't be needed. It automatically transfers everything to the correct device based on the platform you choose.

varun-go commented 9 months ago

Thank you for the responses. I will check out the OpenMM-ML package.

@RaulPPelaez, yes, I am observing odd behavior in the simulation. The system temperature fluctates around the setpoint of 300 K. However, I observe that the waters do not diffuse at all. The first video below shows the entire system of water. The second video shows the same simulation but where I focus on a few water molecules. The molecules appear to translate nearly in unison.

I am unsure if this behavior is expected for this potential or if there is an issue with how the simulation is set up. A paper on arxiv reports using the ANI-1ccx potential to simulate a box of water and I am using this study as my reference.

https://github.com/openmm/openmm-torch/assets/65908747/7bd6d19f-0c49-4295-ab31-66ddd1a6ce5c

https://github.com/openmm/openmm-torch/assets/65908747/4c5c8052-59b8-4283-b135-667bbd3883c2

peastman commented 9 months ago

The water molecules are tightly packed together and linked by a network of hydrogen bonds. On short time scales, they can't diffuse independently. Each molecule is boxed in by its neighbors. On longer time scales, you should see more diffusion.

varun-go commented 9 months ago

Sorry, I should have provided more details regarding the ANI1-ccx simulation. The simulation was run for nearly two ns, and the visualization I showed is from the last ns of the simulation. The trajectory shown is an output frequency of 1 picosecond. I observe this kind of motion throughout the trajectory.

While I understand that the network of hydrogen bonds in a tightly packed box can restrain diffusion, I observe significant mobility in MM water models for the same simulation timescale (two ns). The video below tracks a few waters from the last ns of a two ns simulation of SPCE water (where the frame write out frequency is 1 ps -- which is comparable to the previous video). Also, even with the limited diffusion, is it expected that the water molecules should translate in 'unison'? I do not observe this behavior in the MM water models I have simulated (TIP3P, SPCE, and TIP4P05).

For the MM simulations, I am using the same script as shown above, with the removal of the sections about the neural network potential. The timestep is 2 fs, and the write out frequency is 1 ps. If it would be helpful, I can provide my trajectories.

The behavior I observe contrasts what is reported in the reference study, which states that the model produces diffusion coefficients and radial distributions comparable to experiments and AIMD. (The simulations listed there have 5 ps of equilibration and 10 ps of production. However, they use the ASE program for molecular dynamics). Since I observe this behavior throughout the simulation in OpenMM using ANI-1ccx, and this differs from the results reported using ASE, could this be an issue of how the potential is being applied?

https://github.com/openmm/openmm-torch/assets/65908747/3dcaa2b4-e123-49d2-b96c-d6a9f7ef35a0

peastman commented 9 months ago

is it expected that the water molecules should translate in 'unison'?

Yes, if you don't include a CMMotionRemover in your System. The thermostat applies independent random forces to every atom. They add up to a random total force acting on the center of mass. There's nothing to prevent the center of mass from moving; the whole system is translationally invariant. So it drifts with time. Most simulations include a CMMotionRemover to prevent it.

Once you do that, it should be clearer what's happening with the individual molecules, whether they diffuse as expected.

openmm / openmm-torch

Clarification of water box simulation setup #132