openmm / openmm-ml

High level API for using machine learning models in OpenMM simulations

How to properly set up Periodic Box? #42

Closed kexul closed 1 year ago

kexul commented 1 year ago

Hi, I'm trying to run a molecular simulation using an ML potential. Here is the code I used:

from sys import stdout
from openmmml import MLPotential
from openmm.app import *
from openmm import *
from openmm.unit import *
from openmm.unit import picosecond, picoseconds

pdb = PDBFile('input_new.pdb')

potential = MLPotential('ani2x')
system = potential.createSystem(pdb.topology)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.002*picoseconds)
simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.context.setPeriodicBoxVectors(Vec3(3.0, 0, 0), Vec3(0, 3.0, 0), Vec3(0, 0, 3.0))
simulation.minimizeEnergy()
simulation.reporters.append(PDBReporter('output.pdb', 1000))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True, potentialEnergy=True, temperature=True))
simulation.step(10000)

input_new.pdb.txt

However, the following error occurred:

/data/miniconda3/envs/opm8/lib/python3.9/site-packages/torchani/__init__.py:55: UserWarning: Dependency not satisfied, torchani.ase will not be available
  warnings.warn("Dependency not satisfied, torchani.ase will not be available")
Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
/data/miniconda3/envs/opm8/lib/python3.9/site-packages/torchani/resources/
Traceback (most recent call last):
  File "/data/opm8/bb.py", line 16, in <module>
    simulation.minimizeEnergy()
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/openmm/app/simulation.py", line 137, in minimizeEnergy
    mm.LocalEnergyMinimizer.minimize(self.context, tolerance, maxIterations)
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/openmm/openmm.py", line 8544, in minimize
    return _openmm.LocalEnergyMinimizer_minimize(context, tolerance, maxIterations)
openmm.OpenMMException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/openmmml/models/anipotential.py", line 31, in forward
      _5 = torch.mul(boxvectors1, 10.)
      pbc = self.pbc
      _6, energy1, = (model0).forward(_4, _5, pbc, )
                      ~~~~~~~~~~~~~~~ <--- HERE
      energy = energy1
    energyScale = self.energyScale
  File "code/__torch__/NNPOps/OptimizedTorchANI.py", line 17, in forward
    species_coordinates0 = (species_converter).forward(species_coordinates, None, None, )
    aev_computer = self.aev_computer
    species_aevs = (aev_computer).forward(species_coordinates0, cell, pbc, )
                    ~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    neural_networks = self.neural_networks
    species_energies = (neural_networks).forward(species_aevs, )
  File "code/__torch__/NNPOps/SymmetryFunctions.py", line 37, in forward
      cell0 = cell
    holder = self.holder
    _4 = ops.NNPOpsANISymmetryFunctions.operation(holder, torch.select(positions, 0, 0), cell0)
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    radial, angular, = _4
    features = torch.unsqueeze(torch.cat([radial, angular], 1), 0)

Traceback of TorchScript, original code (most recent call last):
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/openmmml/models/anipotential.py", line 125, in forward
                else:
                    boxvectors = boxvectors.to(torch.float32)
                    _, energy = self.model((self.species, 10.0*positions.unsqueeze(0)), cell=10.0*boxvectors, pbc=self.pbc)
                                ~~~~~~~~~~ <--- HERE

                return self.energyScale*energy
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/NNPOps/OptimizedTorchANI.py", line 52, in forward

        species_coordinates = self.species_converter(species_coordinates)
        species_aevs = self.aev_computer(species_coordinates, cell=cell, pbc=pbc)
                       ~~~~~~~~~~~~~~~~~ <--- HERE
        species_energies = self.neural_networks(species_aevs)
        species_energies = self.energy_shifter(species_energies)
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/NNPOps/SymmetryFunctions.py", line 124, in forward
                    raise ValueError('Only fully periodic systems are supported, i.e. pbc = [True, True, True]')

        radial, angular = operation(self.holder, positions[0], cell)
                          ~~~~~~~~~ <--- HERE
        features = torch.cat((radial, angular), dim=1).unsqueeze(0)

RuntimeError: Encountered error cudaErrorNotSupported at /home/conda/feedstock_root/build_artifacts/nnpops_1658858275941/work/src/ani/CudaANISymmetryFunctions.cu:43

It seems to be telling me that the periodic box is not set up correctly, but I've set it with

simulation.context.setPeriodicBoxVectors(Vec3(3.0, 0, 0), Vec3(0, 3.0, 0), Vec3(0, 0, 3.0))

following the example here.

Any idea? Thanks!

peastman commented 1 year ago

I don't think this has anything to do with periodic boxes. It's throwing a CUDA error:

RuntimeError: Encountered error cudaErrorNotSupported at /home/conda/feedstock_root/build_artifacts/nnpops_1658858275941/work/src/ani/CudaANISymmetryFunctions.cu:43

That indicates it's performing an operation that isn't supported by your GPU or driver. Line 43 of CudaANISymmetryFunctions.cu is

CHECK_RESULT(cudaMallocManaged(&positions, numAtoms*sizeof(float3)));

Managed memory has been supported by all GPUs for quite a long time. What GPU and driver do you have?
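One way to check this without compiling anything is to ask the CUDA runtime directly whether the device reports managed-memory support. The following is a minimal sketch, not part of openmm-ml or NNPOps: the helper name `managed_memory_supported` is hypothetical, it assumes `libcudart` can be located by `ctypes.util.find_library`, and it relies on `cudaDevAttrManagedMemory` having the value 83 in the CUDA runtime headers. It returns `None` whenever the runtime cannot be loaded or queried.

```python
import ctypes
import ctypes.util

# Value of cudaDevAttrManagedMemory in the CUDA runtime's cudaDeviceAttr enum
# (assumption: taken from driver_types.h; verify against your CUDA version).
CUDA_DEV_ATTR_MANAGED_MEMORY = 83


def managed_memory_supported(device=0):
    """Hypothetical helper: True/False if the runtime answers, None otherwise."""
    libname = ctypes.util.find_library("cudart")
    if libname is None:
        return None  # CUDA runtime library not found on this machine
    try:
        cudart = ctypes.CDLL(libname)
    except OSError:
        return None  # library found but could not be loaded
    value = ctypes.c_int(0)
    # cudaError_t cudaDeviceGetAttribute(int* value, cudaDeviceAttr attr, int device)
    status = cudart.cudaDeviceGetAttribute(
        ctypes.byref(value), CUDA_DEV_ATTR_MANAGED_MEMORY, device
    )
    if status != 0:  # any nonzero cudaError_t means the query failed
        return None
    return bool(value.value)


if __name__ == "__main__":
    print("Managed memory supported:", managed_memory_supported())
```

If this reports `False`, calls to `cudaMallocManaged` would be expected to fail on that device, which would be consistent with the error above.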

kexul commented 1 year ago

Many thanks for spotting where the true problem is! @peastman I just tested the cudaMallocManaged function using this snippet, and it throws cudaErrorNotSupported again.

#include <cuda.h>
#include <string>
#include <stdexcept>

using namespace std;

#define CHECK_RESULT(result) \
if (result != cudaSuccess) { \
    throw runtime_error(string("Encountered error ")+cudaGetErrorName(result)+" at "+__FILE__+":"+to_string(__LINE__));\
}

int main() {
        int32_t *A;
        CHECK_RESULT(cudaMallocManaged((void**)&A, sizeof(int32_t)));
        return 0;
}

terminate called after throwing an instance of 'std::runtime_error'
  what():  Encountered error cudaErrorNotSupported at test.cu:14

I'm using a virtualized GPU inside a Docker container, and I'm not clear on the technical details under the hood. Here is the output of nvidia-smi, in case it helps. I'll ask my colleague for more details about the hardware.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04   Driver Version: 450.102.04   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID T4-8C          On   | 00000000:00:09.0 Off |                    0 |
| N/A   N/A    P0    N/A /  N/A |   1104MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

I tested my code on another machine with a real physical GPU, and it now runs fine.