Compiling cpp extensions under CUDA 10.0 or higher

andgitchang commented 4 years ago

Since RTX-class graphics cards need a CUDA version newer than 9.0, is it possible to build the cpp extensions with CUDA 10.0? I changed this line to cuda-10.0 but get errors exactly like this. Do you plan to make lib/csrc compatible with CUDA 10.0? Thanks.

bertid commented 4 years ago

Not sure if that helps, but check that you also edited line 19 of that file, which also has CUDA 9 hard coded.
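One way to avoid editing hard-coded CUDA 9 paths in more than one place is to resolve the toolkit directory once from the environment. The sketch below is only an illustration: the helper names resolve_cuda_home and nvcc_path are hypothetical and not part of clean-pvnet, and the /usr/local/cuda fallback is an assumption about a typical Linux install.

```python
import os


def resolve_cuda_home(env=None, default="/usr/local/cuda"):
    """Return the CUDA toolkit directory from CUDA_HOME, else a default.

    Hypothetical helper; the default path is an assumption, not a
    value taken from the clean-pvnet build scripts.
    """
    env = os.environ if env is None else env
    return env.get("CUDA_HOME") or default


def nvcc_path(cuda_home):
    """Build the expected location of nvcc inside a toolkit directory."""
    return os.path.join(cuda_home, "bin", "nvcc")
```

With CUDA_HOME=/usr/local/cuda-10.0 exported, both helpers follow that setting, so switching toolkits needs no source edit.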

andgitchang commented 4 years ago

@bertid Sorry about that, I did also edit line 19. Have you ever successfully built the cpp extensions with CUDA 10? I look forward to your reply, thank you.

bertid commented 4 years ago

@andgitchang I was able to build it, but not run it yet. For the nn-module, make sure that nvcc is in your $PATH and libcudart.so.10.x in your LD_LIBRARY_PATH, probably like this:

export PATH=$PATH:$CUDA_HOME/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib
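
The two exports above can be sanity-checked before building. This is a minimal Python sketch, assuming a Linux layout; check_cuda_env is a hypothetical helper, not part of clean-pvnet. It lists which PATH entries contain nvcc and which LD_LIBRARY_PATH entries contain a libcudart.so* file.

```python
import glob
import os


def check_cuda_env(env=None):
    """Report where nvcc and libcudart are reachable.

    Hypothetical helper for illustration. Returns a dict with the PATH
    directories that contain `nvcc` and the LD_LIBRARY_PATH directories
    that contain a file matching `libcudart.so*`.
    """
    env = os.environ if env is None else env

    def split_dirs(name):
        # Split a colon-separated variable and drop empty entries.
        return [d for d in env.get(name, "").split(os.pathsep) if d]

    nvcc_dirs = [d for d in split_dirs("PATH")
                 if os.path.isfile(os.path.join(d, "nvcc"))]
    cudart_dirs = [d for d in split_dirs("LD_LIBRARY_PATH")
                   if glob.glob(os.path.join(d, "libcudart.so*"))]
    return {"nvcc": nvcc_dirs, "libcudart": cudart_dirs}
```

If the "libcudart" list comes back empty, it may be worth checking whether the library sits in $CUDA_HOME/lib64 rather than $CUDA_HOME/lib, since many Linux installs use lib64.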

At runtime, I am running into this issue, though:

$> python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat
[...]
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument

Traceback (most recent call last):
  File "run.py", line 226, in <module>
    globals()['run_'+args.type]()
  File "run.py", line 78, in run_evaluate
    output = network(inp)
  File "/myhome/anaconda3/envs/pvnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "lib/networks/pvnet/resnet18.py", line 100, in forward
    self.decode_keypoint(ret)
  File "lib/networks/pvnet/resnet18.py", line 75, in decode_keypoint
    kpt_2d = ransac_voting_layer_v3(mask, vertex, 128, inlier_thresh=0.99, max_num=100)
  File "/myhome/clean-pvnet/lib/csrc/ransac_voting/ransac_voting_gpu.py", line 190, in ransac_voting_layer_v3
    ATA = torch.matmul(normal.permute(0, 2, 1), normal)              # [vn,2,2]
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:450

pengsida commented 4 years ago

@bertid You did not load the pretrained model.

bertid commented 4 years ago

@pengsida Thanks - how can I do that? I followed the instructions: downloaded the cat model (cat_199.pth), copied it to data/model/pvnet/cat/199.pth, and ran the command

python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat

Here is the full output:

Load model: data/model/pvnet/cat/199.pth
loading annotations into memory...
Done (t=0.11s)
creating index...
index created!
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
  0%|          | 0/1002 [00:00<?, ?it/s]
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument

Traceback (most recent call last):
  File "run.py", line 226, in <module>
    globals()['run_'+args.type]()
  File "run.py", line 78, in run_evaluate
    output = network(inp)
  File "/mvtec/home/drost/anaconda3/envs/pvnet9/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "lib/networks/pvnet/resnet18.py", line 100, in forward
    self.decode_keypoint(ret)
  File "lib/networks/pvnet/resnet18.py", line 75, in decode_keypoint
    kpt_2d = ransac_voting_layer_v3(mask, vertex, 128, inlier_thresh=0.99, max_num=100)
  File "/import/mvtec/home/drost/work/P_3d_DeepLearning/clean-pvnet/lib/csrc/ransac_voting/ransac_voting_gpu.py", line 190, in ransac_voting_layer_v3
    ATA = torch.matmul(normal.permute(0, 2, 1), normal)              # [vn,2,2]
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:450

bertid commented 4 years ago

@andgitchang I gave up on CUDA 10 and compiled it with CUDA 9 instead. @pengsida The error went away with a different GPU. Apparently mine was not compatible with CUDA 9.0.

andgitchang commented 4 years ago

I haven't even installed CUDA 9 yet, since the GPUs on my machine are RTX cards. Is the GPU you used with CUDA 9 also an RTX? I'd be glad to install CUDA 9 if it works with both RTX cards and the cpp extensions. I appreciate the information, @bertid.

bertid commented 4 years ago

@andgitchang I was able to run it with CUDA 9 on a GTX 750, but not on a GTX 10-series or RTX 20-series card. Sorry, I don't know whether you can make CUDA 9 work with an RTX card. As far as I know, you'd need CUDA 10.
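
The RTX case can be framed as a compute-capability check. The sketch below is an illustration under stated assumptions: the table values are recalled from NVIDIA's CUDA release notes and should be verified, and MIN_CUDA_FOR_ARCH and toolkit_supports are hypothetical names, not part of clean-pvnet or PyTorch.

```python
# Lowest CUDA toolkit release that can compile for each GPU generation.
# Assumption: values recalled from NVIDIA release notes; verify before use.
MIN_CUDA_FOR_ARCH = {
    (5, 0): (6, 0),   # Maxwell, e.g. GTX 750
    (6, 1): (8, 0),   # Pascal, e.g. GTX 10-series
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing, e.g. RTX 20-series
}


def toolkit_supports(capability, cuda_version):
    """Return True if a toolkit (major, minor) can target a capability."""
    required = MIN_CUDA_FOR_ARCH.get(tuple(capability))
    if required is None:
        raise KeyError(f"no entry for compute capability {capability}")
    return tuple(cuda_version) >= required
```

With PyTorch installed, torch.cuda.get_device_capability() returns the (major, minor) tuple of the current device, which could be passed as capability. Under this table a GTX 10-series card is within CUDA 9's range, so the failure reported on that card presumably had a different cause.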

pengsida commented 4 years ago

@andgitchang I may test the code on CUDA 10 in two weeks.

pengsida commented 3 years ago

@bertid @andgitchang I can run the code with cuda 10.0.

mgawlinska commented 3 years ago

I have the same problem as @bertid on CUDA 10: the same cublas runtime error shown in the traceback in the comment above (THCBlas.cu:450, raised from torch.matmul in ransac_voting_layer_v3). I am training a model from scratch, so I don't need to load a pretrained model.

EDIT: The issue was related to the PyTorch version. After installing PyTorch 1.4.0, everything works fine.
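
A quick way to check an installed version against the one reported to work is to compare parsed version tuples. This is a minimal sketch: version_tuple and meets_minimum are hypothetical helpers, and treating 1.4.0 as a lower bound is an assumption, since the thread only says that 1.4.0 worked.

```python
import re


def version_tuple(version):
    """Turn a version string such as '1.4.0+cu100' into (1, 4, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(version, minimum="1.4.0"):
    """Compare numerically, so that '1.10.0' ranks above '1.4.0'."""
    return version_tuple(version) >= version_tuple(minimum)
```

With PyTorch installed, meets_minimum(torch.__version__) performs the check, and torch.version.cuda reports the CUDA version the installed binary was built against.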

zju3dv / clean-pvnet, issue #144