RuntimeError: CUDA error: CUBLAS_STATUS_ARCH_MISMATCH

I ran into an error when running the following command, and I'd appreciate your help:

torchpack dist-run -np 1 python tools/test.py configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml pretrained/bevfusion-det.pth --eval bbox

The error message is:

RuntimeError: CUDA error: CUBLAS_STATUS_ARCH_MISMATCH when calling cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)

My environment details are as follows:

OS: Ubuntu 22.04
gcc: 10.5.0
GPU: Quadro M6000 24GB
CUDA Version: 11.3
Driver Version: 470.239.06
GPU compute capability: 5.2
Python: 3.8.19
PyTorch: 1.10.1

In my setup.py file, I've included the following lines:

"-gencode=arch=compute_50,code=sm_50", "-gencode=arch=compute_52,code=sm_52",
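For reference, a minimal sketch of how flags in this form can be derived from a compute capability. The helper names gencode_flag and gencode_flags are hypothetical and are not part of the bevfusion setup.py; the output only mirrors the two strings quoted above.

```python
def gencode_flag(major: int, minor: int) -> str:
    """Build one nvcc -gencode flag for a compute capability such as (5, 2)."""
    arch = f"{major}{minor}"
    # arch=compute_XX selects the virtual architecture (PTX),
    # code=sm_XX selects the real architecture (cubin) to generate.
    return f"-gencode=arch=compute_{arch},code=sm_{arch}"


def gencode_flags(capabilities):
    """Build flags for several (major, minor) compute capabilities."""
    return [gencode_flag(major, minor) for major, minor in capabilities]


# The two architectures added to setup.py in this report.
print(gencode_flags([(5, 0), (5, 2)]))
```

Note that these flags only affect the extension kernels compiled by setup.py. They do not change which architectures the installed PyTorch and cuBLAS binaries were built for.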

After re-running 'python setup.py develop', I still get the same error. From my research, cublasGemmEx should support compute capability 5.2. Could you please help me understand what might be causing this issue?
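A diagnostic sketch that may help narrow this down, assuming PyTorch is importable and a CUDA device is visible. It compares the device's compute capability with the architectures the installed PyTorch build reports, then tries a small half-precision matrix multiplication, since the failing cublasGemmEx call uses CUDA_R_16F operands. The helpers parse_arch and device_is_covered are hypothetical names introduced for illustration; whether the FP16 path is the actual cause here is an assumption to be checked, not a confirmed diagnosis.

```python
import re


def parse_arch(entry):
    """Parse 'sm_52' or 'compute_52' into (kind, major, minor); None if unrecognized."""
    match = re.fullmatch(r"(sm|compute)_(\d+)", entry)
    if match is None:
        return None
    digits = match.group(2)
    return match.group(1), int(digits[:-1]), int(digits[-1])


def device_is_covered(capability, arch_list):
    """Check whether a (major, minor) device can run code built for arch_list."""
    major, minor = capability
    for entry in arch_list:
        parsed = parse_arch(entry)
        if parsed is None:
            continue
        kind, a_major, a_minor = parsed
        # A cubin (sm_XY) runs on devices with the same major and minor >= Y.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX (compute_XY) can be JIT-compiled for any capability >= X.Y.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def main():
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; skipping the device checks.")
        return
    if not torch.cuda.is_available():
        print("No CUDA device is visible to PyTorch.")
        return

    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("Device capability:", capability)
    print("PyTorch arch list:", arch_list)
    print("Covered by this build:", device_is_covered(capability, arch_list))

    # Try a small FP16 matmul to see whether half precision alone triggers the error.
    try:
        a = torch.randn(64, 64, device="cuda", dtype=torch.float16)
        _ = a @ a
        torch.cuda.synchronize()
        print("FP16 matmul succeeded.")
    except RuntimeError as err:
        print("FP16 matmul failed:", err)


if __name__ == "__main__":
    main()
```

If the FP16 matmul fails on its own while the same test with torch.float32 succeeds, that would suggest the problem lies in the half-precision cuBLAS path on this GPU rather than in the extensions built by setup.py.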

Thank you once again for your work.

Best regards
