Banconxuan / RTM3D

The official PyTorch Implementation of RTM3D and KM3D for Monocular 3D Object Detection
MIT License
454 stars, 85 forks

Pytorch cuda error assertion failed #22

Open Owen-Liuyuxuan opened 3 years ago

Owen-Liuyuxuan commented 3 years ago

Thank you for your great work!

I was trying out your work with a newer version of PyTorch (1.5.1, to be exact).

I have updated the operations of iou3d and dcn.

Training runs fine most of the time. However, after several epochs, it aborts with:

"/opt/conda/conda-bld/magma-cuda102_1583546904148/work/interface_cuda/interface.cpp:901: void magma_queue_create_from_cuda_internal(magma_device_t, cudaStream_t, cublasHandle_t, cusparseHandle_t, magma_queue**, const char*, const char*, int): Assertion `queue->dCarray__ != __null' failed.
[1]    8178 abort (core dumped)  python3 ./src/main.py --data_dir ./kitti_format --exp_id km3d_multi_class "

This looks similar to an existing PyTorch issue: https://github.com/pytorch/pytorch/issues/26120. However, I can't locate where in the code it is triggered, and it seems difficult to reproduce with a small script.
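
One way to narrow down where the abort comes from, as a minimal sketch rather than a confirmed fix: the log ends in "abort (core dumped)", which suggests the failed assertion raises SIGABRT, and Python's standard `faulthandler` module can print the Python stack when that signal arrives. The helper name `enable_crash_traceback` and the `log_path` parameter are hypothetical and not part of RTM3D. I am assuming the helper would be called once at the top of `src/main.py`, before training starts.

```python
import faulthandler
import sys


def enable_crash_traceback(log_path=None):
    """Hypothetical helper: dump the Python stack on a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    and SIGILL, so an abort raised from native code should still leave a
    Python-level traceback showing which call was running.
    """
    if log_path is None:
        # Default: write the traceback to stderr, next to the native message.
        faulthandler.enable(file=sys.stderr, all_threads=True)
        return None

    # Write to a file instead; the file must stay open until the process
    # exits, so the caller has to keep the returned handle alive.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

The same effect should be available without editing the code by launching the script with `python3 -X faulthandler ./src/main.py ...`. The traceback only shows which Python call was active at the time of the abort; it does not by itself explain why the assertion failed.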

zhang1hongliang commented 2 years ago

I got the same error as you, at the same line of the same file. It seems that the cause cannot be located precisely. Have you solved it? @Owen-Liuyuxuan