krrish94 / chamferdist

Pytorch package to compute Chamfer distance between point sets (pointclouds).

Error: Not compiled with cuda #10

Open sbharadwajj opened 3 years ago

sbharadwajj commented 3 years ago

Hi,

I installed chamferdist via pip (pip install chamferdist). But when I run my training loop, where both source and target are on the GPU (.cuda()), I get the following error:

Traceback (most recent call last):
  File "train.py", line 114, in <module>
    loss_net = chamferDist(pred, gt, bidirectional=True)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/chamferdist/chamfer.py", line 82, in forward
    K=1,
  File "/opt/conda/lib/python3.7/site-packages/chamferdist/chamfer.py", line 267, in knn_points
    p1, p2, lengths1, lengths2, K, version, return_sorted
  File "/opt/conda/lib/python3.7/site-packages/chamferdist/chamfer.py", line 162, in forward
    idx, dists = _C.knn_points_idx(p1, p2, lengths1, lengths2, K, version)
RuntimeError: Not compiled with GPU support.

krrish94 commented 3 years ago

This could be because you aren't using pytorch 1.6?

Compiling from source is the recommended way to install the package

sbharadwajj commented 3 years ago

I am using pytorch 1.6, cuda 10.1

But I will try compiling from source now. I am using a Singularity environment, that's why I used pip.

krrish94 commented 3 years ago

Ah yes - could also be a cuda error. The pypi package was compiled with CUDA 9.0 iirc
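One way to collect the version details being discussed is to print what PyTorch itself reports. The following is a minimal sketch, not part of chamferdist: the helper name describe_torch_env is made up for illustration, and it relies only on torch.__version__, torch.version.cuda and torch.cuda.is_available().

```python
import importlib


def describe_torch_env():
    """Collect the PyTorch / CUDA details relevant to this error.

    Hypothetical helper for illustration; not part of chamferdist.
    """
    info = {"torch": None, "torch_cuda": None, "cuda_available": False}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this environment.
        return info
    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    # Whether a usable GPU and driver are visible at runtime.
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If torch_cuda differs from the CUDA version the installed chamferdist binary was compiled with, a mismatch like the one suspected here is a plausible cause of the error.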

sbharadwajj commented 3 years ago

Okay, I don't have a docker image that is compatible with torch 1.6 and CUDA 9. Will compiling from source work?

krrish94 commented 3 years ago

should work :)

sbharadwajj commented 3 years ago

I am unable to build from source because I don't have root access. Is there anything I can change?

What are the specs for the PyPI package? CUDA 9.0 and torch 1.6?

krrish94 commented 3 years ago

As far as I know you wouldn't need root access to build from source. The recommended way is to use a conda / virtualenv environment

AkbarShah96 commented 3 years ago

Hi Krishna,

I was revisiting this repo today and I remember facing a similar issue. I tried it with PyTorch 1.5, 1.6 and CUDA 10.1. I was building from source.

What version of PyTorch / Cuda do you have?

krrish94 commented 3 years ago

I just reran CI on my end and I can confirm the following pytorch and CUDA versions work

pytorch 1.5, 1.6, 1.7 CUDA 9.2, 10.0, 10.1, 10.2

It would help to know more details about the error

AkbarShah96 commented 3 years ago

[image]

Here is a screenshot of the error I get.

Edit: I also tried building from source on my Windows machine but couldn't get it to build, with both pytorch 1.6 / CUDA 10.2 and pytorch 1.7 / CUDA 11.0. Log file attached: error.log

krrish94 commented 3 years ago

Can you try running example.py from this repo?

AkbarShah96 commented 3 years ago

[image]

Similar error I guess.

saryazdi commented 3 years ago

@AkbarShah96 From the error log you posted for building from source, it seems like you're building with python setup.py build. Try uninstalling chamferdist and removing the build/ folder, then reinstalling by running pip install . from the repository root. I remember getting some build errors on Windows without pip in the past.
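The cleanup steps suggested here can be sketched in Python as follows. This is only an illustration under the assumption that it is run from the root of a chamferdist checkout; the helper names remove_build_dir and reinstall_commands are hypothetical, and the pip commands are only built here, not executed.

```python
import shutil
import sys
from pathlib import Path


def remove_build_dir(repo_root):
    """Delete a stale build/ folder under repo_root.

    Returns True if a folder was removed, False if there was none.
    """
    build_dir = Path(repo_root) / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False


def reinstall_commands():
    """Build the uninstall / reinstall commands for the current interpreter."""
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", "chamferdist"],  # drop the old install
        pip + ["install", "."],                    # rebuild from the checkout
    ]


# Example use (assumes the current directory is the repository root):
#   import subprocess
#   remove_build_dir(".")
#   for cmd in reinstall_commands():
#       subprocess.run(cmd, check=True)
```

Invoking pip through sys.executable keeps the install tied to the active conda or virtualenv interpreter rather than whichever pip happens to be first on PATH.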

AkbarShah96 commented 3 years ago

@saryazdi I tried your suggestion, but I am getting another error. It seems there might be some additional prerequisite steps to consider. Similar errors also occurred today when I tried to build the Chamfer distance from PyTorch3D on Windows.

krrish94 commented 3 years ago

Interesting note, thanks. Question: Does the same error occur if you do python setup.py build develop ?

AkbarShah96 commented 3 years ago

error.log

Yes, I think so.

Just for reference: [image] CUDA is available when I launch Python in the environment!

Relevant versions from conda list: [image] [image]

AkbarShah96 commented 3 years ago

Hey guys,

I managed to resolve this issue with the pytorch3d team after debugging for that repo.

There were 3 changes I made that could have resolved this issue:

  1. Uninstalled other old versions of CUDA and reinstalled CUDA 10.1. Make sure the CUDA environment variables are set to the correct version!

  2. I set the CUDA_HOME environment variable to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1; by default it was pointing to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin for me.

  3. For Windows 10, some changes need to be made to the header files in PyTorch. See PR#323 (pytorch3d).

Hope it helps someone facing similar issues.
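The CUDA_HOME change in step 2 can be checked with a small script. This is a sketch under the assumption, taken from the comment above, that CUDA_HOME should name the toolkit root rather than its bin subfolder; the helper name toolkit_root is hypothetical.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath


def toolkit_root(cuda_home):
    """Return cuda_home with a trailing bin component stripped, if present."""
    # Pick the path flavour from the separators used in the string.
    path_cls = PureWindowsPath if "\\" in cuda_home else PurePosixPath
    path = path_cls(cuda_home)
    if path.name.lower() == "bin":
        path = path.parent
    return str(path)


if __name__ == "__main__":
    current = os.environ.get("CUDA_HOME")
    if current is None:
        print("CUDA_HOME is not set")
    else:
        print("CUDA_HOME currently:", current)
        print("Toolkit root       :", toolkit_root(current))
```

If the two printed paths differ, the variable is pointing at the bin folder, which matches the situation described in step 2.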

ardianumam commented 2 years ago

> I just reran CI on my end and I can confirm the following pytorch and CUDA versions work
>
> pytorch 1.5, 1.6, 1.7 CUDA 9.2, 10.0, 10.1, 10.2
>
> It would help to know more details about the error

I got a similar error although my CUDA version is 10.2 (one of the versions mentioned above). Here is the error message:

  File "/home/user/env_pytorch16/lib/python3.6/site-packages/chamferdist/__init__.py", line 1, in <module>
    from .chamfer import ChamferDistance
  File "/home/user/env_pytorch16/lib/python3.6/site-packages/chamferdist/chamfer.py", line 12, in <module>
    from chamferdist import _C
ImportError: libcudart.so.10.1: cannot open shared object file: No such file or directory

Any suggestion? Many thanks!
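The ImportError above names libcudart.so.10.1, which indicates that the installed chamferdist binary was linked against the CUDA 10.1 runtime, while the environment described provides CUDA 10.2. A minimal sketch for pulling that version out of such a message follows; the function name required_cudart_version is made up for illustration.

```python
import re


def required_cudart_version(error_message):
    """Extract the CUDA runtime version from a missing-libcudart message.

    Returns the version string (e.g. "10.1"), or None if no libcudart
    shared object is mentioned.
    """
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", error_message)
    return match.group(1) if match else None


msg = (
    "ImportError: libcudart.so.10.1: cannot open shared object file: "
    "No such file or directory"
)
print(required_cudart_version(msg))  # prints 10.1
```

If the extracted version does not match the locally installed CUDA runtime, building chamferdist from source in that environment, as recommended earlier in the thread, would likely link it against the local runtime instead.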