CuDNN error while running predict.sh

TriptSharma commented 1 year ago

I am running it on an NVIDIA RTX 3080 GPU and get the following error:

Traceback (most recent call last):
File "/home/se3_tracknet/predict.py", line 640, in <module>
predictSequenceMyData()
File "/home/se3_tracknet/predict.py", line 591, in predictSequenceMyData
cur_pose = tracker.on_track(A_in_cam, rgb, depth, gt_A_in_cam=np.eye(4),gt_B_in_cam=np.eye(4), debug=debug,samples=samples)
File "/home/se3_tracknet/predict.py", line 252, in on_track
prediction = self.model(dataA,dataB)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/se3_tracknet/se3_tracknet.py", line 84, in forward
a = self.convA1(A)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 343, in forward
return self.conv2d_forward(input, self.weight)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 340, in conv2d_forward
self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

The following is the output of "nvidia-smi"


Wed Oct 26 23:54:26 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05 Driver Version: 510.73.05 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 43% 49C P2 100W / 370W | 1624MiB / 10240MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+

Chunsheng13 commented 1 year ago

I get the same error as you. Have you solved it?

wenbowen123 commented 1 year ago

@Chunsheng13 @TriptSharma It may be due to the PyTorch version not being compatible with your CUDA version. Can you try installing a more recent PyTorch inside the Docker container?
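
One way to check this suggestion before reinstalling anything is to compare the GPU's compute capability with the architectures the installed PyTorch was compiled for. The sketch below is only a diagnostic aid, not part of this repository. The helper names `build_supports_device` and `report` are made up for illustration, and the matching rule is a simplified reading of CUDA's compatibility rules (same-major `sm_XY` binaries with a lower or equal minor version, or any `compute_XY` PTX at or below the device). `torch.cuda.get_arch_list()` may be missing on old PyTorch releases, so the script falls back to printing only the versions in that case.

```python
def build_supports_device(capability, arch_list):
    """Return True if a build compiled for `arch_list` can run on a GPU
    with the given (major, minor) compute capability.

    Simplified rule (an assumption of this sketch):
      - "sm_XY" binaries run on devices with the same major version and
        a minor version >= Y
      - "compute_XY" PTX can be JIT-compiled for any device >= X.Y
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():
            continue  # skip entries this sketch does not understand
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report():
    """Print the versions relevant to a PyTorch/CUDA mismatch."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return

    print("torch version:", torch.__version__)
    print("built against CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("CUDA is not available to PyTorch")
        return

    capability = torch.cuda.get_device_capability(0)
    print("device:", torch.cuda.get_device_name(0), "capability:", capability)
    print("cuDNN version:", torch.backends.cudnn.version())

    # get_arch_list() does not exist on older releases, so look it up defensively.
    get_arch_list = getattr(torch.cuda, "get_arch_list", None)
    if get_arch_list is None:
        print("This PyTorch is too old to report its compiled architectures")
        return
    arch_list = get_arch_list()
    print("compiled architectures:", arch_list)
    print("build supports this GPU:", build_supports_device(capability, arch_list))


if __name__ == "__main__":
    report()
```

Run it inside the Docker container with the same Python interpreter that `predict.sh` uses. As far as I know, the RTX 3080 reports compute capability 8.6, so a build whose architecture list stops at `sm_75`, or a build too old to report its list at all, would be consistent with the `CUDNN_STATUS_EXECUTION_FAILED` error above.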

wenbowen123 / iros20-6d-pose-tracking

CuDNN error while running predict.sh #49