ThibaultGROUEIX / ChamferDistancePytorch

Chamfer Distance in Pytorch with f-score
MIT License

Error when running forward #10

Closed: Steve-Tod closed this 4 years ago

Steve-Tod commented 4 years ago

Hi, my code is like this:

from chamfer3D.dist_chamfer_3D import chamfer_3DDist
import torch
nnd = chamfer_3DDist()
pc1 = torch.rand(4, 2048, 3).cuda()
pc2 = torch.rand(4, 2048, 3).cuda()
dist1, dist2, _, _ = nnd(pc1, pc2)

Then I get:

error in nnd updateOutput: no kernel image is available for execution on the device

I'm using pytorch 1.3.1, cuda 10.0 and cudnn 7.6.0

How can I solve this?

ThibaultGROUEIX commented 4 years ago

Hi @Steve-Tod, did you run the tests? Cheers, Thibault

Steve-Tod commented 4 years ago

Hi, I ran unit_test.py and got the following output:

Jitting Chamfer 2D
Traceback (most recent call last):
  File "unit_test.py", line 2, in <module>
    import chamfer2D.dist_chamfer_2D
  File "/orion/u/jiangthu/projects/DeformationNet/code/model/loss/ChamferDistancePytorch/chamfer2D/dist_chamfer_2D.py", line 15, in <module>
    "/".join(os.path.abspath(__file__).split('/')[:-1] + ["chamfer2D.cu"]),
  File "/orion/u/jiangthu/anaconda3/envs/pytorch1.3_py3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 658, in load
    build_directory or _get_build_directory(name, verbose),
  File "/orion/u/jiangthu/anaconda3/envs/pytorch1.3_py3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1015, in _get_build_directory
    os.makedirs(build_directory)
  File "/orion/u/jiangthu/anaconda3/envs/pytorch1.3_py3/lib/python3.7/os.py", line 221, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/tmp/torch_extensions/chamfer_2D'

BTW, I Googled the original error and found that it might be caused by an incompatibility between pytorch 1.3.1 and the Tesla V100-DGX I'm using.

jih189 commented 4 years ago

@Steve-Tod I think you can try this first; it may solve the problem: https://github.com/ThibaultGROUEIX/ChamferDistancePytorch/issues/13

Steve-Tod commented 4 years ago

Hi @jih189, I used conda to create a new env with pytorch 1.3.1 and cuda 10.0. In this new env, I cloned this repo and ran unit_test.py, and got "PermissionError: [Errno 13] Permission denied: '/tmp/torch_extensions/'" again.
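
A PermissionError from os.makedirs usually means the current user cannot write to the parent directory. One quick way to check is to look at who owns the directory and whether it is writable. The snippet below is only a sketch using the standard library; `describe_dir` is a hypothetical helper name, and the path is assumed to be the default location shown in the traceback.

```python
import os
import tempfile


def describe_dir(path):
    """Report whether `path` exists, who owns it, and whether we can write to it.

    Hypothetical helper for illustration; not part of this repo or of PyTorch.
    """
    if not os.path.exists(path):
        return {"exists": False, "owner_uid": None, "writable": False}
    info = os.stat(path)
    return {
        "exists": True,
        "owner_uid": info.st_uid,
        # Creating a subdirectory needs both write and search permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }


# Assumed default location, matching the traceback: <tmp>/torch_extensions
target = os.path.join(tempfile.gettempdir(), "torch_extensions")
report = describe_dir(target)
print(target, report)
```

If the directory exists, is owned by a different uid, and is reported as not writable, the JIT build cannot create its own subdirectory there.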

Steve-Tod commented 4 years ago

Hi guys, I found the cause of the problem. I'm sharing the machine with others. Another person used JIT to compile a torch extension, which created the directory /tmp/torch_extensions/. He owns this directory and I don't, so I don't have permission to write into it. On another machine where /tmp/torch_extensions/ hasn't been created, everything runs smoothly.
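
For anyone hitting the same thing on a shared machine, one possible workaround is to point the JIT build at a directory you own. PyTorch's torch.utils.cpp_extension reads the TORCH_EXTENSIONS_DIR environment variable when choosing its build directory, so setting it before the chamfer module is imported should avoid the shared /tmp/torch_extensions/ path. This is a sketch, not something tested in this thread; `use_private_extensions_dir` and the per-user directory name are made up for illustration.

```python
import getpass
import os
import tempfile


def use_private_extensions_dir(base=None):
    """Create a per-user build directory and export it as TORCH_EXTENSIONS_DIR.

    Hypothetical helper for illustration; not part of this repo or of PyTorch.
    """
    base = base or tempfile.gettempdir()
    try:
        user = getpass.getuser()
    except Exception:
        # Fall back to a fixed suffix if the user name cannot be resolved.
        user = "user"
    build_dir = os.path.join(base, "torch_extensions_" + user)
    os.makedirs(build_dir, exist_ok=True)
    os.environ["TORCH_EXTENSIONS_DIR"] = build_dir
    return build_dir


build_dir = use_private_extensions_dir()
print("JIT extensions will be built in:", build_dir)

# Import the JIT-compiled module only after the variable is set, e.g.:
# from chamfer3D.dist_chamfer_3D import chamfer_3DDist
```

The same effect can be had by exporting TORCH_EXTENSIONS_DIR in the shell before launching Python. Passing `base` (for example a directory under your home) keeps builds out of /tmp altogether.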