PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

Cannot train RT-DETR #8405

Open Waynepoo opened 1 year ago

Waynepoo commented 1 year ago

Search before asking

Please ask your question

training on single-GPU

export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_coco.yml --eval

The error is as follows:

File "/usr/local/lib/python3.7/dist-packages/decorator.py", line 232, in fun
  return caller(func, *(extras + args), **kw)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/wrapped_decorator.py", line 25, in impl
  return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/framework.py", line 434, in impl
  return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/paddle/tensor/creation.py", line 189, in to_tensor
  stop_gradient=stop_gradient)
OSError: (External) CUDA error(719), unspecified launch failure.
[Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases can be found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:258)

Environment: I tried three setups: a manual install following the official installation steps, the Docker image paddlecloud/paddledetection:2.4-gpu-cuda11.2-cudnn8-latest, and the Docker image paddlecloud/paddledetection:2.4-gpu-cuda10.2-cudnn7-latest. All three produce the same error.

lyuwenyu commented 1 year ago

Could this be a mismatch with your CUDA version?
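
A minimal sketch for checking such a mismatch, assuming Paddle is installed in the same Python environment and `nvidia-smi` is on the PATH. The helper name `collect_cuda_versions` is hypothetical. It only reports the CUDA and cuDNN versions Paddle was built against and the NVIDIA driver version; whether that driver supports the reported CUDA version still has to be checked against NVIDIA's compatibility table.

```python
import shutil
import subprocess


def collect_cuda_versions():
    """Report the versions Paddle was built against and the installed driver.

    Hypothetical helper: entries stay None when Paddle or nvidia-smi
    is not available on this machine.
    """
    info = {"paddle": None, "paddle_cuda": None, "paddle_cudnn": None, "driver": None}

    try:
        import paddle

        info["paddle"] = paddle.__version__
        # CUDA / cuDNN versions the installed Paddle wheel was compiled with.
        info["paddle_cuda"] = paddle.version.cuda()
        info["paddle_cudnn"] = paddle.version.cudnn()
    except ImportError:
        pass  # Paddle is not installed in this environment.

    if shutil.which("nvidia-smi"):
        # Ask the driver for its version, one line per GPU.
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0 and result.stdout.strip():
            info["driver"] = result.stdout.strip().splitlines()[0]

    return info


if __name__ == "__main__":
    for key, value in collect_cuda_versions().items():
        print(f"{key}: {value}")
```

If the reported versions look consistent, `paddle.utils.run_check()` can be run next to confirm that Paddle can execute a small program on the GPU at all, independent of PaddleDetection.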