openmm / NNPOps

High-performance operations for neural network potentials
Other
81 stars 17 forks source link

getNeighborPairs error #68

Closed davkovacs closed 1 year ago

davkovacs commented 2 years ago

I am trying to use the new neighbourlist with pytorch 1.12.1. I have managed to install NNPOps via conda.

If I check the PyTorch version it is 1.12, so that is fine.

If I try to import NNPOps that is also fine:

>>> from NNPOps.neighbors import getNeighborPairs

And it is event telling me it needs two positional arguments:

>>> getNeighborPairs()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: getNeighborPairs() missing 2 required positional arguments: 'positions' and 'cutoff'

But when I actually specify those positional arguments I am getting the following error:

>>> getNeighborPairs(positions=positions, cutoff=cutoff)
Traceback (most recent call last):
  File "/home/dpk25/.conda/envs/MACE/lib/python3.10/site-packages/torch/_ops.py", line 198, in __getattr__
    op, overload_names = torch._C._jit_get_operation(qualified_op_name)
RuntimeError: No such operator neighbors::getNeighborPairs

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dpk25/.conda/envs/MACE/lib/python3.10/site-packages/NNPOps/neighbors/getNeighborPairs.py", line 82, in getNeighborPairs
    return ops.neighbors.getNeighborPairs(positions, cutoff, max_num_neighbors)
  File "/home/dpk25/.conda/envs/MACE/lib/python3.10/site-packages/torch/_ops.py", line 202, in __getattr__
    raise AttributeError(f"'_OpNamespace' object has no attribute '{op_name}'") from e
AttributeError: '_OpNamespace' object has no attribute 'getNeighborPairs'

Does anyone perhaps know what might be going wrong?

A few relevant info about the environment:

python                    3.10.4               h12debd9_0  
torch                     1.12.1+cu113             pypi_0    pypi

Interestingly, NNPOps does not appear when I do conda list.

davkovacs commented 2 years ago

I think the problem might be torch 1.12, for some reason the NNPOps still seems to only support 1.11 . My code sadly only works with 1.12. Would it be possible to change that in environment.yml ?

channels:
  - conda-forge
dependencies:
  - cmake >=3.20
  - cudatoolkit 11.2.2
  - gxx_linux-64 10.3.0
  - make
  - mdtraj
  - nvcc_linux-64 11.2
  - torchani 2.2.2
  - pytest
  - python 3.10.*
  - pytorch-gpu 1.11.0
  - sysroot_linux-64 2.17
RaulPPelaez commented 1 year ago

I cannot reproduce this. NNPOps should work with 1.12 now, though.

RaulPPelaez commented 1 year ago

I have actually been cursed with this. I have found that manually loading the .so in getNeighborPairs.py fixes it:

from torch import  ops
import site
ops.load_library(site.getsitepackages()[-1]+"/NNPOps/libNNPOpsPyTorch.so")

It is strange, because while I observe the behavior described by @davkovacs, running pytest works. Pytest is able to load and call getNeighborPairs during the tests, but I cannot do so from a python script such as this one:

import torch as pt
import numpy as np
import pytest
from NNPOps.neighbors import getNeighborPairs

device = 'cuda:0'
dtype = pt.float32
num_atoms = 100
# Generate random positions
positions = 10 * pt.randn((num_atoms, 3), device=device, dtype=dtype)
cutoff = float(5.0)
num_neighbors = num_atoms*num_atoms
neighbors, deltas, distances = getNeighborPairs(positions, cutoff=cutoff, max_num_neighbors=num_neighbors)

I have no idea of what causes this or how to reproduce it.

peastman commented 1 year ago

I think we just need to move the code to load the library?

https://github.com/openmm/NNPOps/blob/16543f913b230363409986875a5f479708bf24d0/src/pytorch/SymmetryFunctions.py#L29-L30

It gets repeated in SymmetryFunction.py, CFConv.py, and BatchedNN.py. It doesn't appear in getNeighborPairs.py, so the library doesn't get loaded in that case. How about loading the library once, in the top level __init__.py?

RaulPPelaez commented 1 year ago

Ohhh so this is why the tests run, they must import one of those.

How about loading the library once, in the top level init.py?

I do not know what the standard knowledge is on this, but that seems sensible. Let me test and I will PR.