Open kryczko opened 2 years ago
Similar issue, probably related?
num_repeats = torch.where(pbc, num_repeats, num_repeats.new_zeros(()))
~~~~~~~~~~~ <--- HERE
r1 = torch.arange(1, num_repeats[0].item() + 1, device=cell.device)
r2 = torch.arange(1, num_repeats[1].item() + 1, device=cell.device)
RuntimeError: Expected condition, x and y to be on the same device, but condition is on cpu and x and y are on cuda:0 and cuda:0 respectively
Clean conda environment on Ubuntu, installed packages:
openmm 7.7.0 py39h792354b_0 conda-forge
openmm-torch 0.5 cuda112py39hb628e3f_0 conda-forge
openmmml 1.0 pypi_0 pypi
pytorch 1.10.0 cuda112py39h3ad47f5_1 conda-forge
pytorch-gpu 1.10.0 cuda112py39h0bbbad9_1 conda-forge
torchani 2.2.3.dev2+g3dfbaf4 pypi_0 pypi
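For anyone reading along: the RuntimeError in the traceback above means that the condition passed to torch.where (pbc) is on the CPU while the other two arguments are on cuda:0. torch.where needs all three on one device. The snippet below is only a minimal sketch of that pattern, not the fix that went into any library. The helper name zero_nonperiodic_repeats and the sample values are made up for illustration.

```python
import torch


def zero_nonperiodic_repeats(num_repeats: torch.Tensor, pbc: torch.Tensor) -> torch.Tensor:
    """Zero the repeat count along non-periodic axes.

    Hypothetical helper: it mirrors the torch.where call from the
    traceback and adds an explicit device move for the mask.
    """
    # torch.where needs the condition and both branches on one device,
    # so bring the boolean mask to wherever num_repeats lives.
    pbc = pbc.to(num_repeats.device)
    return torch.where(pbc, num_repeats, num_repeats.new_zeros(()))


# Use the GPU when one is present; on a CPU-only machine the code still runs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
num_repeats = torch.tensor([2, 3, 1], device=device)
pbc = torch.tensor([True, False, True])  # deliberately created on the CPU

result = zero_nonperiodic_repeats(num_repeats, pbc)
print(result.tolist())
```

On a CUDA machine, dropping the pbc.to(...) line should reproduce the same kind of error as in the traceback.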
Hi, thanks for the report! Could you provide a minimal example to reproduce this?
It might be more suitable for a separate issue since I'm using an openmm stack.
See the full output of the code here:
https://github.com/meyresearch/ANI-Peptides/blob/main/demos/ANI_minimal.ipynb
conda install -c conda-forge openmm openmm-torch pytorch cudatoolkit=11.5
Set CUDA_HOME to /usr/local/cuda and add /usr/local/cuda to PATH
git clone https://github.com/aiqm/torchani
cd torchani
python setup.py install --cuaev
git clone https://github.com/openmm/openmm-ml
pip install openmm-ml/.
wget -q https://github.com/meyresearch/ANI-Peptides/raw/main/pdbs/aaa.pdb
# Import libraries
from openmm.app import *
from openmm import *
from openmm.unit import *
from openmmml import MLPotential
import sys
# Setup
pdb = PDBFile("aaa.pdb")
potential = MLPotential('ani2x')
system = potential.createSystem(pdb.topology)
integrator = LangevinIntegrator(
300 * kelvin,
1 / picosecond,
1.0 * femtosecond,
)
simulation = Simulation(
pdb.topology,
system,
integrator,
Platform.getPlatformByName("CUDA"),
)
simulation.context.setPositions(pdb.positions)
# Minimize and run
simulation.minimizeEnergy()
simulation.step(1000)
print("done")
Hi, the error comes from the openmm-ml wrapper. A temporary fix that works ONLY on GPU can be found at: https://github.com/yueyericardo/openmm-ml/commit/1d1d3f24f40becdcd8a36431c8d0900d98eb1304#diff-911692ca194bf903c77d038662969ad3277dcf2fa8b3b3048d95a5aa3af59de1
It uses cuaev (use_cuda_extension) for the AEV calculation, but cuaev currently does not support PBC, so if you want to use cuaev, you have to change your script slightly to:
pdb = PDBFile("aaa.pdb")
# add this line
pdb.topology.setPeriodicBoxVectors(None)
potential = MLPotential('ani2x')
Our internal version has some other updates that make it faster, but it is not open source yet. In the meantime, the OpenMM team is building NNPOps for ANI and SchNet; you can track the progress here: Add example of using NNPOps with openmm-torch?!
Edit: BTW, our conda-forge package includes the latest public build with cuaev, so you can install it directly with:
conda install -c conda-forge torchani
Fantastic! Thank you for looking into this and getting back to me so quickly.
I am still getting the same issue I showed above while using an ANI model within PyTorch Lightning. Any ideas how to fix it?
I am trying to define an ANI model along with the AEVComputer module (with CUDA enabled) inside a PyTorch Lightning module, but I am getting the following error:
I have seen that some of the parameters are registered as buffers, but some are not. Please let me know what you think.
Kev
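On the buffer question: in PyTorch, only parameters and tensors registered with register_buffer follow a module when it is moved with .to() or .cuda(), and Lightning also moves the module that way. A plain tensor attribute stays on the device where it was created, which can lead to this kind of cpu/cuda mismatch. Below is a minimal diagnostic sketch for spotting such attributes. ToyAEV and unregistered_tensors are hypothetical names used only for illustration, and whether AEVComputer actually holds unregistered tensors is not confirmed here.

```python
import torch
from torch import nn


class ToyAEV(nn.Module):
    """Hypothetical stand-in for a module that stores constant tensors."""

    def __init__(self):
        super().__init__()
        # Registered buffer: moved together with the module by .to()/.cuda().
        self.register_buffer("shifts", torch.linspace(0.9, 5.2, 4))
        # Plain attribute: stays on the device where it was created.
        self.cutoffs = torch.tensor([5.2, 3.5])


def unregistered_tensors(module: nn.Module):
    """List tensor attributes that are neither parameters nor buffers.

    Parameters and buffers live in dedicated dicts on the module, so any
    tensor found directly in a submodule's instance __dict__ is unregistered.
    Tensors hidden inside lists or dicts are not detected by this sketch.
    """
    found = []
    for mod_name, mod in module.named_modules():
        for attr, value in vars(mod).items():
            if isinstance(value, torch.Tensor):
                found.append(f"{mod_name}.{attr}" if mod_name else attr)
    return found


model = ToyAEV()
print("buffers:", [name for name, _ in model.named_buffers()])
print("unregistered:", unregistered_tensors(model))
```

If the list of unregistered tensors is not empty for the real model, those attributes are candidates for register_buffer or for an explicit .to(device) call before the forward pass.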