The issue persists with `periodic_table_index=False` as well. I previously forgot to remove the edit.
@raimis is the best person to comment on the implementation. One suggestion I'd make is that we probably don't want a flag for which implementation to use. If the optimized one is available, we should just use it automatically. I don't think there's ever a reason not to?
> One suggestion I'd make is that we probably don't want a flag for which implementation to use. If the optimized one is available, we should just use it automatically. I don't think there's ever a reason not to?
We will presumably need a way to test the optimized implementation against the non-optimized one, or to use the non-optimized one if circumstances require it, similar to how we can elect to use the Reference platform in OpenMM instead of more performant versions if desired.
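For illustration only, here is a minimal sketch of what automatic selection with an explicit escape hatch could look like; the helper name and its flag are hypothetical, not part of openmm-ml:

```python
def optimize_ani_model(model, species, force_reference=False):
    """Return an NNPOps-optimized model when available, else the model unchanged.

    Hypothetical helper: mirrors OpenMM's Reference-platform escape hatch by
    letting callers force the unoptimized TorchANI path for testing.
    """
    if not force_reference:
        try:
            from NNPOps import OptimizedTorchANI
            return OptimizedTorchANI(model, species)  # optimized path
        except ImportError:
            pass  # NNPOps not installed; fall back silently
    return model  # plain torchani model
```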
@dominicrufa I have created a draft of a tutorial (https://github.com/openmm/openmm-torch/pull/62). This should give an example how to use NNPOps.
You can see it better here (https://github.com/raimis/openmm-torch/blob/example/tutorials/openmm-torch-nnpops.ipynb) or just
Ah, I think I was just treating the `atomic_numbers` field wrong. I think I have it working now; will update soon.
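For reference, the species handling follows the pattern in the NNPOps README; a minimal sketch, assuming a toy 3-atom molecule (with `periodic_table_index=True`, the model expects atomic numbers rather than 0-based species indices):

```python
import torch
import torchani
from NNPOps import OptimizedTorchANI

device = torch.device('cuda')
# atomic numbers (not 0-based species indices), because periodic_table_index=True
species = torch.tensor([[8, 1, 1]], device=device)            # illustrative: O, H, H
positions = torch.rand((1, 3, 3), device=device)              # Angstrom, illustrative
model = torchani.models.ANI2x(periodic_table_index=True).to(device)
model = OptimizedTorchANI(model, species).to(device)          # optimized drop-in
energy = model((species, positions)).energies
```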
Alright, so I can now integrate NNPOps into TorchANI. Interestingly, equipping NNPOps has a cost of ~0.11 s/MD step, whereas omitting NNPOps costs ~0.03 s/MD step (script below). As a reference, MM-only MD costs ~0.0004 s/MD step:
```python
#!/usr/bin/env python
import torch
import torchani

#from NNPOps.SpeciesConverter import TorchANISpeciesConverter
#from NNPOps.SymmetryFunctions import TorchANISymmetryFunctions
#from NNPOps.BatchedNN import TorchANIBatchedNN
#from NNPOps.EnergyShifter import TorchANIEnergyShifter
from NNPOps import OptimizedTorchANI

from openmmtools.testsystems import HostGuestExplicit
from openmmtools.integrators import LangevinIntegrator
from openmmml.mlpotential import MLPotential
from simtk import openmm, unit
import time
import numpy as np


device = torch.device('cuda')

hgv = HostGuestExplicit(constraints=None)

potential = MLPotential('ani2x')
system = potential.createMixedSystem(hgv.topology, system=hgv.system, atoms=range(126, 156), use_OptimizedTorchANI=True)
print("done making system")


_int = LangevinIntegrator()
context = openmm.Context(system, _int)
context.setPositions(hgv.positions)
print(f"unminimized pe: {context.getState(getEnergy=True).getPotentialEnergy()}")
openmm.LocalEnergyMinimizer.minimize(context, maxIterations=100)
context.setVelocitiesToTemperature(298.15*unit.kelvin)

# timer
timer = []
for i in range(10):
    start_time = time.time()
    _int.step(100)
    print(context.getState(getEnergy=True).getPotentialEnergy())
    timer.append(time.time() - start_time)

print(np.mean(timer), np.std(timer))
```
@dominicrufa I tried your script. On GPU, it crashes:
```
Traceback (most recent call last):
  File "/home/user/tmp/ml.py", line 38, in <module>
    _int.step(100)
  File "/home/user/conda/lib/python3.9/site-packages/openmm/openmm.py", line 7788, in step
    return _openmm.CustomIntegrator_step(self, steps)
openmm.OpenMMException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: CUDA driver error: invalid resource handle
```
On CPU, it runs at ~0.02 s/MD step for both cases. This is reasonable, because `NNPOps.BatchedNN` is quite inefficient on CPU: it trades memory bandwidth for speed, and CPUs don't have a lot of memory bandwidth.
@raimis I modified the bench script, fixing the bug you encountered:
```python
import torch
import torchani

#from NNPOps.SpeciesConverter import TorchANISpeciesConverter
#from NNPOps.SymmetryFunctions import TorchANISymmetryFunctions
#from NNPOps.BatchedNN import TorchANIBatchedNN
#from NNPOps.EnergyShifter import TorchANIEnergyShifter
from NNPOps import OptimizedTorchANI

from openmmtools.testsystems import HostGuestExplicit
from openmmtools.integrators import LangevinIntegrator
from openmmml.mlpotential import MLPotential
from simtk import openmm, unit
import time
import numpy as np

device = torch.device('cuda')

hgv = HostGuestExplicit(constraints=None)

potential = MLPotential('ani2x')
system = potential.createMixedSystem(hgv.topology, system=hgv.system, atoms=range(126, 156), use_OptimizedTorchANI=True)
print("done making system")

# was causing error: RuntimeError: CUDA driver error: invalid resource handle
# _int = LangevinIntegrator()
_int = openmm.LangevinIntegrator(
    300 * unit.kelvin,
    1 / unit.picosecond,
    1.0 * unit.femtosecond,
)
context = openmm.Context(system, _int)
context.setPositions(hgv.positions)
print(f"unminimized pe: {context.getState(getEnergy=True).getPotentialEnergy()}")
openmm.LocalEnergyMinimizer.minimize(context, maxIterations=100)

# was causing error: openmm.OpenMMException: The autograd engine was called while holding the GIL.
# context.setVelocitiesToTemperature(298.15*unit.kelvin)

# timer
timer = []
for i in range(10):
    start_time = time.time()
    _int.step(100)
    print(context.getState(getEnergy=True).getPotentialEnergy())
    timer.append(time.time() - start_time)

print(np.mean(timer), np.std(timer))
```
I never encountered that bug on GPU; the `nnpops` I was using was from `mamba install -c mmh nnpops`, since Mike Henry is trying to place his version on conda-forge. @raimis, is there an env yaml you can send me to reproduce the error? Mine is attached: nnpops.txt
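In case it helps narrow down environment differences, here is a small version-report sketch (nothing PR-specific; it just prints the packages most likely to differ between our setups):

```python
# Print the package and CUDA versions most likely to explain env-specific behavior.
import torch
import torchani
from simtk import openmm

print("torch:", torch.__version__, "| built for CUDA:", torch.version.cuda)
print("torchani:", torchani.__version__)
print("openmm:", openmm.version.version)
print("CUDA available:", torch.cuda.is_available())
```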
Also, I might be missing something, but if `openmmtools.integrators.LangevinIntegrator` (or any `CustomIntegrator` object) is not compatible with NNPOps, this might cause more problems downstream.
That's also how I installed mine, though I wouldn't put it past my setup to misbehave! But I think you're right: `CustomIntegrator`s are triggering an issue here.
I missed the details. What is the issue with `CustomIntegrator`?
And do I understand @dominicrufa correctly that NNPOps and the optimized TorchANI make things 5x slower, rather than 5x faster? And are these single ANI models, or ensembles?
`CustomIntegrator` bug: using `openmmtools.integrators.LangevinIntegrator` rather than manually configuring an `openmm.LangevinIntegrator` was triggering this error: `RuntimeError: CUDA driver error: invalid resource handle`
NNPOps-optimised ANI is significantly faster for me. For a 30-atom unsolvated system, I get about 5 ns/day with unoptimised TorchANI; the same system with optimised ANI reaches 30 ns/day.
@not-matt thanks for your effort.
```python
# was causing error: RuntimeError: CUDA driver error: invalid resource handle
# _int = LangevinIntegrator()
```
I guess this might be some incompatibility between OpenMM-Torch and OpenMM-Tools. Could you try to reduce the script to a minimum that triggers the issue, and open a separate issue? For the moment, using `openmm.LangevinIntegrator` is a viable solution.
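Something like the following is probably the minimal shape of such a reproducer; a sketch only, untested, with the test system and atom range carried over from the benchmark above:

```python
# Minimal-repro sketch: the only moving part is the integrator class.
from openmmtools.testsystems import HostGuestExplicit
from openmmtools.integrators import LangevinIntegrator  # a CustomIntegrator subclass
from openmmml.mlpotential import MLPotential
from simtk import openmm

hgv = HostGuestExplicit(constraints=None)
system = MLPotential('ani2x').createMixedSystem(hgv.topology, system=hgv.system, atoms=range(126, 156))
integrator = LangevinIntegrator()  # swapping in openmm.LangevinIntegrator(...) avoids the crash
context = openmm.Context(system, integrator, openmm.Platform.getPlatformByName('CUDA'))
context.setPositions(hgv.positions)
integrator.step(1)  # expected here: RuntimeError: CUDA driver error: invalid resource handle
```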
```python
# was causing error: openmm.OpenMMException: The autograd engine was called while holding the GIL.
# context.setVelocitiesToTemperature(298.15*unit.kelvin)
```
This is already fixed by https://github.com/openmm/openmm/pull/3424; we just need to release OpenMM 7.7.1.
I've updated the nnpops package in my channel to 0.2: https://anaconda.org/mmh/nnpops/files. I've now got a mix of different cuda/python packages built. Still waiting on a review from the conda-forge people, but `conda update nnpops` should pull in the latest version now.
> I've updated the nnpops package in my channel to 0.2: https://anaconda.org/mmh/nnpops/files. I've now got a mix of different cuda/python packages built. Still waiting on a review from the conda-forge people, but `conda update nnpops` should pull in the latest version now.
@mikemhenry, did you replicate the code snippet with your conda installation?
I forgot I was going to try that. Can you link the code snippet? There are several in this thread and I want to make sure to test the right one @dominicrufa
> I forgot I was going to try that. Can you link the code snippet? There are several in this thread and I want to make sure to test the right one @dominicrufa
https://github.com/openmm/openmm-ml/pull/20#issue-1117917500. If it fails (which was reported by others in this thread), then try this: https://github.com/openmm/openmm-ml/pull/20#issuecomment-1029053808
Using the version from my channel (`mamba update nnpops -c mmh`), which pulls in `nnpops 0.2 cuda112py39h453d82a_0 mmh/linux-64 493 KB`: both snippets worked for me.
With the second snippet (https://github.com/openmm/openmm-ml/pull/20#issuecomment-1029053808) I got a mean time of 2.0347 s per 100 steps (0.083 s std). With `use_OptimizedTorchANI=False`: 1.9108 s per 100 steps (0.0753 s std). (This is on my laptop, and I'm not sure whether `/home/mmh/miniconda3/envs/nnpops-private/lib/python3.9/site-packages/torchani/__init__.py:55: UserWarning: Dependency not satisfied, torchani.ase will not be available` is an issue or not, so take this with a grain of salt; I'm mostly testing the snippet to make sure it works.)
Are there any reasons why conda installs would yield such differences in performance, or errors on some setups but not others?
Lots of reasons. The performance differences are likely hardware differences; I ran the script on an NVIDIA GeForce RTX 2060 card in my laptop, so the GPU was also busy running my X server. The errors I'm less sure about. I did check that I was using a GPU; with CPU I get 5.669 s with `use_OptimizedTorchANI=False` and 6.3206 s with `use_OptimizedTorchANI=True`.
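(For anyone repeating this: one way to confirm which device OpenMM actually picked is to ask the context directly, assuming the `context` object from the benchmark script above.)

```python
# Confirm which OpenMM platform the context is actually using.
print(context.getPlatform().getName())  # e.g. 'CUDA' or 'CPU'
```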
@dominicrufa: Is this something you can work with @mikemhenry on interactively on lilac to debug? Has anyone tried this on Google Colab, for example?
> @dominicrufa: Is this something you can work with @mikemhenry on interactively on lilac to debug? Has anyone tried this on Google Colab, for example?
If this is a hardware/conda-versions issue, I don't know how I would go about debugging it. My env yaml is posted here. Based on @raimis's tutorial, Google Colab seems to have different performance than what Mike and I see.
@peastman, I implemented the changes/tests to the NNPOps implementation discussed here. I parameterized the unittest here, and while I can get the CUDA-disabled test to pass (using `pytest`), running CUDA with the `nnpops` implementation still throws an `openmm.OpenMMException: Error invoking kernel: CUDA_ERROR_INVALID_HANDLE (400)` exception with the latest conda-installable nightly build of openmm.
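For context, the parameterization looks roughly like the sketch below; this is not the actual test file, and the assertion is a placeholder:

```python
# Sketch of a test parameterized over implementation and platform.
import pytest
from openmmtools.testsystems import HostGuestExplicit
from openmmml.mlpotential import MLPotential
from simtk import openmm

@pytest.mark.parametrize('use_optimized', [False, True])
@pytest.mark.parametrize('platform_name', ['Reference', 'CUDA'])
def test_mixed_system(use_optimized, platform_name):
    hgv = HostGuestExplicit(constraints=None)
    system = MLPotential('ani2x').createMixedSystem(
        hgv.topology, system=hgv.system, atoms=range(126, 156),
        use_OptimizedTorchANI=use_optimized)
    integrator = openmm.VerletIntegrator(0.001)
    platform = openmm.Platform.getPlatformByName(platform_name)
    context = openmm.Context(system, integrator, platform)
    context.setPositions(hgv.positions)
    energy = context.getState(getEnergy=True).getPotentialEnergy()
    assert energy is not None  # placeholder; real test compares implementations
```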
Can you list which CUDA toolkit versions of these packages you installed, and which node/driver you were running on? I think we often see that when there is a mismatch between CUDA build version and driver version.
> Can you list which CUDA toolkit versions of these packages you installed, and which node/driver you were running on? I think we often see that when there is a mismatch between CUDA build version and driver version.
cudatoolkit=11.3.1, using an lt node. I don't suspect that being the problem, since I can manually equip a `TorchForce` with `nnpops` without `openmm-ml`, and it will run on a GPU just fine.
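For reference, the manual route looks roughly like this; a sketch only, with a hypothetical wrapper module and an illustrative species tensor, mirroring what openmm-ml does internally rather than any API from this PR:

```python
# Sketch: equip a TorchForce with an NNPOps-optimized ANI model, no openmm-ml.
import torch
import torchani
from NNPOps import OptimizedTorchANI
from openmmtorch import TorchForce

class ANIModule(torch.nn.Module):
    def __init__(self, species):
        super().__init__()
        model = torchani.models.ANI2x(periodic_table_index=True)
        self.model = OptimizedTorchANI(model, species)
        self.species = species

    def forward(self, positions):
        # OpenMM supplies nm; ANI expects Angstrom. Energy: Hartree -> kJ/mol.
        positions = 10.0 * positions.unsqueeze(0).to(torch.float32)
        return 2625.5 * self.model((self.species, positions)).energies[0]

species = torch.tensor([[8, 1, 1]])           # illustrative: O, H, H
torch.jit.script(ANIModule(species)).save('ani.pt')
force = TorchForce('ani.pt')                  # add to a System as usual
```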
The CUDA Toolkit release notes state:

> Each release of the CUDA Toolkit requires a minimum version of the CUDA driver. The CUDA driver is backward compatible, meaning that applications compiled against a particular version of the CUDA will continue to work on subsequent (later) driver releases.

CUDA 11.3 requires a driver `>=450.80.02`, and it looks like `lt20` has 465.19.01:
```
$ nvidia-smi
Fri Apr  8 20:42:13 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:84:00.0 Off |                  N/A |
| 28%   32C    P8     9W / 250W |      1MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
This is superseded by #35.
Integrate `NNPOps` into `MLPotential`.

I want to be able to integrate `OptimizedTorchANI` directly into `TorchANI` models under the hood when I create a fully ML or hybrid ANI/OpenMM system. When I run the following:

with the new modification, I see that

but not without the modification. I am not modifying the `species` attr, so I'm not sure where this is coming from.