WXinlong / SOLO

SOLO and SOLOv2 for instance segmentation, ECCV 2020 & NeurIPS 2020.

CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle) #180

Closed BIGWangYuDong closed 2 years ago

BIGWangYuDong commented 3 years ago

Hi, I tried to run SOLOv1 on Slurm using GPUS=8 GPUS_PER_TASK=8 ./tools/slurm_train.sh XX XX config/solo/solo_r50_1x.py

During validation, it raised the following error: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)

The error seems to be raised at torch.mm, but I don't know how to fix it. When I rerun the code, the error does not occur again.
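Since the failure is intermittent, it may help to record the GPU state at the moment it happens. Below is a minimal sketch of one way to do that, not a fix. The names cuda_state and report_cublas_failure are hypothetical helpers and are not part of SOLO or MMDetection. The sketch assumes the error surfaces as a RuntimeError whose message contains CUBLAS_STATUS_NOT_INITIALIZED. This error is often reported when GPU memory is exhausted as the cuBLAS handle is created, but that cause is only an assumption here.

```python
import functools

try:
    import torch
except ImportError:  # lets the helper be imported on machines without PyTorch
    torch = None

# Substring of the error message reported in this issue.
CUBLAS_MARKER = "CUBLAS_STATUS_NOT_INITIALIZED"


def cuda_state():
    """Collect version and memory information for the current GPU, if any."""
    if torch is None:
        return {"torch": None}
    info = {"torch": torch.__version__, "cuda": torch.version.cuda}
    if torch.cuda.is_available():
        dev = torch.cuda.current_device()
        info["device"] = dev
        info["allocated_bytes"] = torch.cuda.memory_allocated(dev)
        info["total_bytes"] = torch.cuda.get_device_properties(dev).total_memory
    return info


def report_cublas_failure(fn):
    """Hypothetical decorator: print the CUDA state if fn hits the cuBLAS error."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except RuntimeError as err:
            if CUBLAS_MARKER in str(err):
                print("cuBLAS initialization failed; CUDA state:", cuda_state())
            raise  # always re-raise so the original traceback is preserved
    return wrapper
```

Wrapping the function that calls torch.mm with report_cublas_failure would print the allocated and total memory of the current device whenever the error occurs, which could show whether the GPU was nearly full at that point.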

I run the code under PyTorch 1.4 and CUDA 9.0. I have also rewritten your code for MMDetection v2.0+ with PyTorch 1.5 and CUDA 9.0, and I sometimes get the same error there.

Do you know how to fix this bug, or do you have any ideas?

Yudong