RuntimeError: CUDA error: no kernel image is available for execution on the device

Tianyi20 commented 1 week ago

Hi, I followed all the instructions you provided and got to the final demo without problems. However, when I run:

python demo.py --demo /home/tianyi/pose_estimation/src/CenterPose/images/CenterPose/cup --arch dlav1_34 --load_model /home/tianyi/pose_estimation/src/CenterPose/models/cup_mug_v1_140.pth

(CenterPose) tianyi@tianyi-Redmi-G-2022:~/pose_estimation/src/CenterPose/src$ python demo.py --demo /home/tianyi/pose_estimation/src/CenterPose/images/CenterPose/cup --arch dlav1_34 --load_model /home/tianyi/pose_estimation/src/CenterPose/models/cup_mug_v1_140.pth
/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/sklearn/utils/linear_assignment_.py:22: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  FutureWarning)
Fix size testing.
training chunk_sizes: [1]
The output will be saved to  /home/tianyi/pose_estimation/src/CenterPose/src/lib/../../exp/object_pose/default
heads {'hm': 1, 'wh': 2, 'hps': 16, 'reg': 2, 'hm_hp': 8, 'hp_offset': 2, 'scale': 3}
Creating model...
loaded /home/tianyi/pose_estimation/src/CenterPose/models/cup_mug_v1_140.pth, epoch 140
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=8 : invalid device function
Traceback (most recent call last):
  File "demo.py", line 156, in <module>
    demo(opt, meta)
  File "demo.py", line 83, in demo
    ret = detector.run(image_name, meta_inp=meta)
  File "/home/tianyi/pose_estimation/src/CenterPose/src/lib/detectors/base_detector.py", line 474, in run
    images, self.pre_images, pre_hms, pre_hm_hp, pre_inds, return_time=True)
  File "/home/tianyi/pose_estimation/src/CenterPose/src/lib/detectors/object_pose.py", line 135, in process
    output = self.model(images, pre_images, pre_hms, pre_hm_hp)[-1]
  File "/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tianyi/pose_estimation/src/CenterPose/src/lib/models/networks/pose_dla_dcn.py", line 528, in forward
    x = self.base(x)
  File "/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tianyi/pose_estimation/src/CenterPose/src/lib/models/networks/pose_dla_dcn.py", line 312, in forward
    x = self.base_layer(x)
  File "/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/modules/activation.py", line 99, in forward
    return F.relu(input, inplace=self.inplace)
  File "/home/tianyi/anaconda3/envs/CenterPose/lib/python3.6/site-packages/torch/nn/functional.py", line 941, in relu
    result = torch.relu_(input)
RuntimeError: CUDA error: no kernel image is available for execution on the device

That is the output I got. I searched for the error, and it is reportedly caused by a mismatch between the GPU and the CUDA version. So I reinstalled CUDA and ran some examples to test whether CUDA works.
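For anyone debugging the same message: one way to narrow down this kind of mismatch is to compare the GPU's compute capability with the architectures the installed torch binary was compiled for. The following is only a rough sketch, not something from the CenterPose repo. `parse_arch` and `has_kernel_image` are hypothetical helper names, and `torch.cuda.get_arch_list()` is only present in newer torch releases, so the script falls back to a message when it is missing (as it likely would be in an old environment like the one in the traceback).

```python
import re


def parse_arch(arch):
    """Split a tag such as 'sm_75' or 'compute_80' into (kind, major, minor)."""
    match = re.fullmatch(r"(sm|compute)_(\d{2,})[a-z]?", arch)
    if match is None:
        raise ValueError("unrecognized architecture tag: %r" % arch)
    kind, digits = match.group(1), match.group(2)
    # The last digit is the minor version, everything before it is the major.
    return kind, int(digits[:-1]), int(digits[-1])


def has_kernel_image(capability, arch_list):
    """Return True if a binary built for arch_list can run on this capability.

    'sm_XY' machine code runs on devices with the same major version and a
    minor version >= Y. 'compute_XY' PTX can be JIT-compiled for any device
    with capability >= X.Y.
    """
    major, minor = capability
    for arch in arch_list:
        kind, a_major, a_minor = parse_arch(arch)
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is None or not torch.cuda.is_available():
        print("torch with a visible CUDA device is needed for the live check")
    else:
        capability = torch.cuda.get_device_capability(0)
        print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
        print("GPU compute capability:", capability)
        # get_arch_list is an assumption about the installed torch version.
        get_arch_list = getattr(torch.cuda, "get_arch_list", None)
        if get_arch_list is None:
            print("this torch build cannot report its compiled architectures")
        else:
            archs = get_arch_list()
            print("compiled for:", archs)
            print("kernel image available:", has_kernel_image(capability, archs))
```

If the last line prints `False`, the torch binary has no code for the GPU, which would be consistent with the "no kernel image is available" error regardless of how the system CUDA toolkit is installed.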

[Screenshot from 2024-11-20 20-07-33]

Everything worked fine there! But when I tried demo.py again, it still didn't work.
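A possible explanation for the CUDA examples passing while demo.py fails is that they test the system toolkit, whereas the error comes from kernels shipped inside the torch binary. A small smoke test that goes through torch itself may be more telling. This is a minimal sketch under that assumption; `cuda_smoke_test` is a hypothetical helper, and it calls `torch.relu_` because that is the call at the bottom of the traceback above.

```python
def cuda_smoke_test():
    """Return 'no-torch', 'no-cuda', 'ok', or 'failed: <message>'."""
    try:
        import torch
    except ImportError:
        return "no-torch"
    if not torch.cuda.is_available():
        return "no-cuda"
    try:
        x = torch.tensor([-1.0, 0.0, 2.0], device="cuda")
        torch.relu_(x)            # same in-place op that fails in the traceback
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
        values = x.cpu().tolist()
    except RuntimeError as err:
        return "failed: %s" % err
    if values != [0.0, 0.0, 2.0]:
        return "failed: unexpected values %r" % values
    return "ok"


print(cuda_smoke_test())
```

Run it inside the same conda environment as demo.py. If it reports a failure while the toolkit examples pass, the torch build is the more likely culprit than the CUDA installation.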

Tianyi20 commented 1 week ago

I just solved the problem by installing the correct CUDA version together with the corresponding torch build. However, that torch is a newer version than before, so I switched DCNv2 to the latest one, https://github.com/lucasjinreal/DCNv2_latest, by running:

git clone https://github.com/lucasjinreal/DCNv2_latest.git

Then replace the original DCNv2 folder with it. Note that here we go into the folder and run python3 setup.py build develop instead of the original ./make.sh:

python3 setup.py build develop

After that, it should be fine. You can run demo.py directly.
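Before running the full demo, it may help to confirm that the rebuilt extension can actually be found by Python. The snippet below is a sketch only: `module_available` is a hypothetical helper, and the module names `_ext` and `dcn_v2` are assumptions about what the DCNv2 build produces, so substitute whatever names your checkout uses.

```python
import importlib.util


def module_available(name):
    """Return True if Python can locate a module called `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken parents or half-initialized modules.
        return False


# Assumed module names; adjust them to match your DCNv2 folder.
for name in ("_ext", "dcn_v2"):
    print(name, "found" if module_available(name) else "not found")
```

Run it from inside the DCNv2 folder (or with that folder on `PYTHONPATH`), since the modules may only be visible from there.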

Tianyi20 commented 1 week ago

@Uio96 Can you add my comments to the readme?

NVlabs / CenterPose

RuntimeError: CUDA error: no kernel image is available for execution on the device #29