Jingkang50 / OpenPSG

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
https://psgdataset.org
MIT License

RuntimeError: unable to find a valid cuDNN algorithm to convolution #110

Open HWH-2019 opened 9 months ago

HWH-2019 commented 9 months ago

When I train with multiple GPUs on one machine, I get the following error: RuntimeError: CUDA error: an illegal memory access was encountered

When I set CUDA_LAUNCH_BLOCKING=1, I got more information about the error:

RuntimeError: unable to find a valid cuDNN algorithm to convolution
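
For anyone reproducing this, the CUDA_LAUNCH_BLOCKING=1 setting mentioned above can also be applied from a small Python launcher. The following is only a minimal sketch: the helper names `blocking_env` and `run_with_blocking` are hypothetical, and the training command passed in is whatever you normally run (it is not taken from this repository).

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    # With this variable set, kernel launches are synchronous, so an error is
    # reported at the call that caused it rather than at a later, unrelated call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking(cmd):
    """Run cmd (a list of arguments) with CUDA_LAUNCH_BLOCKING=1; return its exit code."""
    return subprocess.run(cmd, env=blocking_env()).returncode
```

A call such as `run_with_blocking([sys.executable, "train.py"])` would start a script with the variable set, where `train.py` is a placeholder for your own entry point. The variable has to be in the environment before the process initializes CUDA, which is why the sketch sets it on the child process instead of changing it mid-run.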

I found that others have encountered this problem, but no good solution has been posted. I also noticed that the problem does not occur when training the MOTIFS model.

Is this a problem with the PSGFormer model code itself, or is it caused by the GPUs not having enough VRAM, as suggested on the Internet?

And why didn't I get a "CUDA out of memory" error?

The run-time environment is as follows:

# system
GPU RTX3090
cuda 12.0
cudnn 11.8

# run-time
pytorch==1.7.1
torchvision==0.8.2
torchaudio==0.7.2
cudatoolkit=11.0
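
When comparing environments like the one listed above, it can help to print the versions that PyTorch itself reports, since these may differ from the system-wide CUDA and cuDNN installs. The following is a minimal sketch; the function name `collect_env` and the dictionary keys are hypothetical, and the import is guarded so the script still runs where PyTorch is not installed.

```python
def collect_env():
    """Collect the versions PyTorch reports for itself, CUDA, cuDNN, and the GPUs."""
    info = {"torch": None, "cuda_build": None, "cudnn": None, "gpus": []}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter; return the empty report.
        return info

    info["torch"] = torch.__version__
    # CUDA version the PyTorch binary was built against (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    if torch.backends.cudnn.is_available():
        # cuDNN version as an integer, as reported by PyTorch.
        info["cudnn"] = torch.backends.cudnn.version()
    if torch.cuda.is_available():
        info["gpus"] = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Posting this output alongside the error makes it easier to see which CUDA and cuDNN builds the failing process is actually using.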

Jingkang50 commented 9 months ago

Sorry, I cannot come up with a solution to this problem. Have you solved it?