xingyizhou / CenterNet

Object detection, 3D detection, and pose estimation using center point detection
MIT License

RuntimeError: Not compiled with GPU support after reboot #943

Open jatinkatyal opened 3 years ago

jatinkatyal commented 3 years ago

I got everything working by following the original installation steps, except that instead of the DCNv2 version for PyTorch 0.4 I used the master branch for PyTorch 1.x. I tested DCNv2 and it works as expected. I then ran CenterNet on COCO following GETTING_STARTED.md, and that also worked.

After I rebooted the system, DCNv2 still works, but running CenterNet/src/test.py gives the error below.

Each time, I have to create a new environment, install requirements.txt, and make DCNv2 again to get things working.

Traceback (most recent call last):
  File "test.py", line 126, in <module>
    prefetch_test(opt)
  File "test.py", line 70, in prefetch_test
    ret = detector.run(pre_processed_images)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/detectors/base_detector.py", line 116, in run
    output, dets, forward_time = self.process(images, return_time=True)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/detectors/ctdet.py", line 30, in process
    output = self.model(images)[-1]
  File "/home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 472, in forward
    x = self.dla_up(x)
  File "/home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 411, in forward
    ida(layers, len(layers) -i - 2, len(layers))
  File "/home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 384, in forward
    layers[i] = upsample(project(layers[i]))
  File "/home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/models/networks/pose_dla_dcn.py", line 355, in forward
    x = self.conv(x)
  File "/home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/models/networks/DCNv2/dcn_v2.py", line 128, in forward
    self.deformable_groups)
  File "/home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/models/networks/DCNv2/dcn_v2.py", line 31, in forward
    ctx.deformable_groups)
RuntimeError: Not compiled with GPU support (dcn_v2_forward at /home/jatin/InternalHDD/Work/computerVision/CenterNet/src/lib/models/networks/DCNv2/src/dcn_v2.h:35)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7ff2d1b93627 in /home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xb339 (0x7ff2b7c08339 in /home/jatin/.local/lib/python3.6/site-packages/DCNv2-0.1-py3.6-linux-x86_64.egg/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #2: <unknown function> + 0x1da2f (0x7ff2b7c1aa2f in /home/jatin/.local/lib/python3.6/site-packages/DCNv2-0.1-py3.6-linux-x86_64.egg/_ext.cpython-36m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x1af91 (0x7ff2b7c17f91 in /home/jatin/.local/lib/python3.6/site-packages/DCNv2-0.1-py3.6-linux-x86_64.egg/_ext.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #9: THPFunction_apply(_object*, _object*) + 0xa1f (0x7ff30943be3f in /home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
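
One detail in the frames above may be worth checking: the `_ext` library is loaded from `/home/jatin/.local/lib/python3.6/site-packages` (the per-user site directory), not from the `CenterNet` conda environment that provides torch. The following is a minimal sketch, not part of the original report, that pulls shared-library paths out of such a trace and lists those outside a given environment prefix. The helper name `libs_outside_env` and the regular expression are assumptions made for illustration.

```python
import re

# Matches the "in /path/to/lib.so)" tail of a C++ stack frame line.
_FRAME_LIB = re.compile(r"\bin (/\S+\.so[\w.]*)\)")


def libs_outside_env(trace_text, env_prefix):
    """Return shared-library paths in the trace that are not under env_prefix."""
    prefix = env_prefix.rstrip("/") + "/"
    seen = []
    for path in _FRAME_LIB.findall(trace_text):
        # Keep each foreign path once, in order of first appearance.
        if not path.startswith(prefix) and path not in seen:
            seen.append(path)
    return seen


# Two frames copied from the trace in this issue.
trace = """\
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7ff2d1b93627 in /home/jatin/anaconda3/envs/CenterNet/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xb339 (0x7ff2b7c08339 in /home/jatin/.local/lib/python3.6/site-packages/DCNv2-0.1-py3.6-linux-x86_64.egg/_ext.cpython-36m-x86_64-linux-gnu.so)
"""
outside = libs_outside_env(trace, "/home/jatin/anaconda3/envs/CenterNet")
for path in outside:
    print(path)
```

If the copy under `.local` turns out to be an older build, one thing to try (an assumption, not confirmed in this thread) is removing that egg or running with the `PYTHONNOUSERSITE=1` environment variable set, so that only packages installed inside the environment are importable.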

Python 3.6.13 torch==1.4.0
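
As far as I can tell, this message comes from an extension binary that was built without its CUDA sources, so it can help to record what the interpreter sees both when DCNv2 is built and when it is run. Below is a small sketch of such a report, assuming nothing beyond the standard library and, optionally, torch. The function name `build_environment_report` is hypothetical, and which of these settings the DCNv2 build actually consults is an assumption to verify against its setup.py.

```python
import os
import shutil
import sys

try:
    import torch  # optional: the report still works without it
except ImportError:
    torch = None


def build_environment_report():
    """Collect settings that may influence how a CUDA extension is built or found."""
    report = {
        "python": sys.executable,
        # False when Python was started with -s or PYTHONNOUSERSITE is set.
        "user_site_enabled": not sys.flags.no_user_site,
        "nvcc_on_path": shutil.which("nvcc"),
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
    }
    if torch is not None:
        report["torch"] = torch.__version__
        report["torch_cuda"] = torch.version.cuda  # None for CPU-only builds
        report["cuda_available"] = torch.cuda.is_available()
    return report


for key, value in build_environment_report().items():
    print(f"{key}: {value}")
```

Running this once in the shell where DCNv2 was built and once after a reboot, then comparing the two outputs, may show whether something such as `PATH` or `CUDA_HOME` differs between the two sessions.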

$ nvidia-smi
Wed Oct 13 00:33:32 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:10:00.0  On |                  N/A |
|  0%   42C    P8    13W / 120W |    535MiB /  5941MiB |      9%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1188      G   /usr/lib/xorg/Xorg                 35MiB |
|    0   N/A  N/A      1780      G   /usr/lib/xorg/Xorg                154MiB |
|    0   N/A  N/A      1909      G   /usr/bin/gnome-shell              201MiB |
|    0   N/A  N/A      3587      G   /usr/lib/firefox/firefox          133MiB |
+-----------------------------------------------------------------------------+
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Aug_15_21:14:11_PDT_2021
Cuda compilation tools, release 11.4, V11.4.120
Build cuda_11.4.r11.4/compiler.30300941_0

sicarioakki commented 2 years ago

Hi, have you solved this issue? I am getting the same error.

sicarioakki commented 2 years ago

I get the error while running demo.py.