open-mmlab / mmdetection3d

OpenMMLab's next-generation platform for general 3D object detection.
https://mmdetection3d.readthedocs.io/en/latest/
Apache License 2.0
5.18k stars, 1.52k forks

[Bug] Pointpillars training fails on Kitti after 2 epoch (ERROR: numba.cuda.cudadrv.driver.LinkerError and ptxas application ptx input, line 9; fatal : Unsupported .version 7.6; current version is '7.4') #2720

Open Mufan187569 opened 1 year ago

Mufan187569 commented 1 year ago

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: NVIDIA GeForce GTX 1070 Ti
CUDA_HOME: /home/lw/miniconda3/envs/openmmlab
NVCC: Cuda compilation tools, release 11.6, V11.6.124
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 1.13.1
PyTorch compiling details: PyTorch built with:

TorchVision: 0.14.1
OpenCV: 4.8.0
MMEngine: 0.8.4
MMDetection: 3.1.0
MMDetection3D: 1.2.0+c04831c
spconv2.0: False
Numba: 0.56.4
Numpy: 1.19.5

Reproduces the problem - code sample

none

Reproduces the problem - command or script

python tools/train.py configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-car.py

Reproduces the problem - error message

When I run python tools/train.py configs/pointpillars/pointpillars_hv_secfpn_8xb6-160e_kitti-3d-car.py on the KITTI dataset, I get the following error after the first epoch ends:

Converting 3D prediction to KITTI format
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 50/50, 895.5 task/s, elapsed: 0s, ETA: 0s
Result is saved to /tmp/tmpqdiad9q5/results/pred_instances_3d.pkl.
Traceback (most recent call last):
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 2705, in add_ptx
    driver.cuLinkAddData(self.handle, enums.CU_JIT_INPUT_PTX,
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 320, in safe_cuda_api_call
    self._check_ctypes_error(fname, retcode)
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 388, in _check_ctypes_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [222] Call to cuLinkAddData results in UNKNOWN_CUDA_ERROR

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tool/train.py", line 135, in <module>
    main()
  File "tool/train.py", line 131, in main
    runner.train()
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1745, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 102, in run
    self.runner.val_loop.run()
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 366, in run
    metrics = self.evaluator.evaluate(len(self.dataloader.dataset))
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/evaluator/evaluator.py", line 79, in evaluate
    _results = metric.evaluate(size)
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/evaluator/metric.py", line 133, in evaluate
    _metrics = self.compute_metrics(results)  # type: ignore
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/metrics/kitti_metric.py", line 205, in compute_metrics
    ap_dict = self.kitti_evaluate(
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/metrics/kitti_metric.py", line 244, in kitti_evaluate
    ap_result_str, apdict = kitti_eval(
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/functional/kitti_utils/eval.py", line 725, in kitti_eval
    mAP40_3d, mAP40_aos = do_eval(gt_annos, dt_annos,
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/functional/kitti_utils/eval.py", line 626, in do_eval
    ret = eval_class(gt_annos, dt_annos, current_classes, difficultys, 1,
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/functional/kitti_utils/eval.py", line 480, in eval_class
    rets = calculate_iou_partly(dt_annos, gt_annos, metric, num_parts)
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/functional/kitti_utils/eval.py", line 384, in calculate_iou_partly
    overlap_part = bev_box_overlap(dt_boxes,
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/functional/kitti_utils/eval.py", line 118, in bev_box_overlap
    from .rotate_iou import rotate_iou_gpu_eval
  File "/home/lw/LW/mmdetection3d/mmdet3d/evaluation/functional/kitti_utils/rotate_iou.py", line 283, in <module>
    def rotate_iou_kernel_eval(N,
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/decorators.py", line 115, in _jit
    disp.compile(argtypes)
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/dispatcher.py", line 796, in compile
    kernel.bind()
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/dispatcher.py", line 178, in bind
    self._codelibrary.get_cufunc()
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/codegen.py", line 208, in get_cufunc
    cubin = self.get_cubin(cc=device.compute_capability)
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/codegen.py", line 181, in get_cubin
    linker.add_ptx(ptx.encode())
  File "/home/lw/miniconda3/envs/openmmlab/lib/python3.8/site-packages/numba/cuda/cudadrv/driver.py", line 2708, in add_ptx
    raise LinkerError("%s\n%s" % (e, self.error_log))
numba.cuda.cudadrv.driver.LinkerError: [222] Call to cuLinkAddData results in UNKNOWN_CUDA_ERROR
ptxas application ptx input, line 9; fatal : Unsupported .version 7.6; current version is '7.4'
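The last line of the log is the informative one: the linker was handed PTX marked as version 7.6 but only accepts up to 7.4. The sketch below is a small, hypothetical helper (the name explain_ptx_mismatch and the lookup table are not part of numba or mmdetection3d) that pulls both versions out of such a message and maps them to the CUDA 11.x release that introduced each PTX ISA version, as listed in NVIDIA's PTX release notes. Under that mapping, the message suggests PTX was generated for CUDA 11.6 while the component doing the linking only supports CUDA 11.4, though the log alone does not say which installed package is responsible.

```python
import re

# PTX ISA version -> CUDA release that introduced it (CUDA 11.x only).
# Hand-copied lookup table for illustration; extend it for other releases.
PTX_TO_CUDA = {
    "7.0": "11.0", "7.1": "11.1", "7.2": "11.2",
    "7.3": "11.3", "7.4": "11.4", "7.5": "11.5",
    "7.6": "11.6", "7.7": "11.7", "7.8": "11.8",
}

# Matches the ptxas message quoted in the traceback above.
_PATTERN = re.compile(
    r"Unsupported \.version (\d+\.\d+); current version is '(\d+\.\d+)'"
)


def explain_ptx_mismatch(message):
    """Return the requested/supported PTX versions found in `message`.

    Returns None when the message does not contain a PTX version error.
    """
    match = _PATTERN.search(message)
    if match is None:
        return None
    requested, supported = match.groups()
    return {
        "ptx_requested": requested,
        "ptx_supported": supported,
        # None if the version is outside the small table above.
        "cuda_for_requested": PTX_TO_CUDA.get(requested),
        "cuda_for_supported": PTX_TO_CUDA.get(supported),
    }


if __name__ == "__main__":
    log_line = (
        "ptxas application ptx input, line 9; fatal : "
        "Unsupported .version 7.6; current version is '7.4'"
    )
    print(explain_ptx_mismatch(log_line))
```

Pasting a different ptxas error line into log_line gives the same breakdown for other version pairs covered by the table.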

Additional information

No response

sunjiahao1999 commented 1 year ago

This happens because your installed version of numba is not compatible with your environment. Reinstall an appropriate version of numba.
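Before reinstalling, it may help to record what is currently installed so the numba version can be compared against the CUDA driver. The following is a minimal sketch, not an official mmdetection3d tool: the function names are made up for illustration, it relies only on the standard library, and it assumes nvidia-smi is on PATH (it returns None for the driver otherwise). The thread does not say which numba version is the appropriate one, so the script only reports versions and does not recommend one.

```python
import shutil
import subprocess
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def driver_version():
    """Return the NVIDIA driver version reported by nvidia-smi, or None."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # With several GPUs the driver version is repeated once per GPU.
    return lines[0].strip() if lines else None


def collect_versions():
    """Gather the versions most relevant to the PTX mismatch above."""
    return {
        "numba": installed_version("numba"),
        "numpy": installed_version("numpy"),
        "torch": installed_version("torch"),
        "nvidia_driver": driver_version(),
    }


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version or 'not found'}")
```

Running the script before and after changing the numba installation shows whether the reinstall actually changed the version in the active conda environment.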