open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

Run time CUDA error #11412

Open rajnishrajput12 opened 10 months ago

rajnishrajput12 commented 10 months ago

I am getting RuntimeError: CUDA error: no kernel image is available for execution on the device. Below is my config:

python: 3.8
mmcv: 1.7.2 (wheel file installation, https://download.openmmlab.com/mmcv/dist/cu121/torch2.1.0/mmcv_full-1.7.2-cp38-cp38-manylinux1_x86_64.whl)
mmdet: 2.28.1
torch: 2.1.0
torchvision: 0.16.0
CUDA: 12.1
GPU: NVIDIA A100-SXM4-80GB

The error starts in mmdet/apis/inference.py (line 157) and then goes to mmcv/ops/nms.py (line 350).

Error traceback:

Traceback (most recent call last):
  File "table_detection_server.py", line 46, in get_table_bbox
    bboxes = worker.get_table_bbox(file_name, page_num, src_fname, bucket_nm)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/utils/gpu_handler.py", line 54, in get_table_bbox
    bboxes = Worker.model.get_table_bbox(page)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/utils/nn_utils.py", line 137, in get_table_bbox
    result = self.get_table_preds(temp_file)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/utils/nn_utils.py", line 42, in get_table_preds
    return inference_detector(self.model, temp_file)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/apis/inference.py", line 157, in inference_detector
    results = model(return_loss=False, rescale=True, **data)
  File "/opt/app-root/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
    return old_func(*args, **kwargs)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/models/detectors/base.py", line 174, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/models/detectors/base.py", line 147, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/models/detectors/two_stage.py", line 179, in simple_test
    proposal_list = self.rpn_head.simple_test_rpn(x, img_metas)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/models/dense_heads/dense_test_mixins.py", line 130, in simple_test_rpn
    proposal_list = self.get_bboxes(*rpn_outs, img_metas=img_metas)
  File "/opt/app-root/lib64/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 208, in new_func
    return old_func(*args, **kwargs)
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/models/dense_heads/base_dense_head.py", line 102, in get_bboxes
    results = self._get_bboxes_single(cls_score_list, bbox_pred_list,
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/models/dense_heads/rpn_head.py", line 185, in _get_bboxes_single
    return self._bbox_post_process(mlvl_scores, mlvl_bbox_preds,
  File "/app/aafa/aafa_parser/aafa_parser_table_detection/mmdetection/mmdet/models/dense_heads/rpn_head.py", line 231, in _bbox_post_process
    dets, _ = batched_nms(proposals, scores, ids, cfg.nms)
  File "/opt/app-root/lib64/python3.8/site-packages/mmcv/ops/nms.py", line 350, in batched_nms
    dets, keep = nms_op(boxes_for_nms, scores, **nms_cfg_)
  File "/opt/app-root/lib64/python3.8/site-packages/mmcv/utils/misc.py", line 340, in new_func
    output = old_func(*args, **kwargs)
  File "/opt/app-root/lib64/python3.8/site-packages/mmcv/ops/nms.py", line 175, in nms
    inds = NMSop.apply(boxes, scores, iou_threshold, offset, score_threshold,
  File "/opt/app-root/lib64/python3.8/site-packages/torch/autograd/function.py", line 539, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/opt/app-root/lib64/python3.8/site-packages/mmcv/ops/nms.py", line 28, in forward
    inds = ext_module.nms(
RuntimeError: CUDA error: no kernel image is available for execution on the device
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Application log:

Error - CUDA error: no kernel image is available for execution on the device
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
in parsing page 0 of file tmp/estorage_RG_kcl-2022-annual-ff1ea1.pdf
[2024-01-19 19:11:32,313] ERROR in app: Exception on /get_table_bbox [GET]
Traceback (most recent call last):
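Editorial note: "no kernel image is available" is the CUDA error raised when a binary contains no kernel compiled for the GPU it is asked to run on. A minimal sketch of how one might inspect the environment is shown here. The helper names `has_kernel_for` and `report` are hypothetical, and the architecture list printed is the one PyTorch itself was built with; it does not describe the mmcv extension, whose compiled architectures cannot be queried this way, so treat the output as a hint rather than a diagnosis.

```python
def has_kernel_for(capability, arch_list):
    """Return True if arch_list contains code usable on a GPU with `capability`.

    Assumes entries look like 'sm_80' (binary code) or 'compute_80' (PTX).
    Binary code runs on devices with the same major version and an equal or
    higher minor version; PTX can be JIT-compiled for any equal or newer device.
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, num = arch.partition("_")
        if not num.isdigit():
            continue  # skip entries this sketch does not understand, e.g. 'sm_90a'
        a_major, a_minor = int(num[:-1]), int(num[-1])
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report():
    """Print what torch and mmcv say about their CUDA builds, if installed."""
    try:
        import torch
    except ImportError:
        print("torch is not installed")
        return
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    if not torch.cuda.is_available():
        print("CUDA is not available to torch")
        return
    cap = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()  # architectures of torch's own kernels
    print("GPU:", torch.cuda.get_device_name(0), "capability", cap)
    print("torch arch list:", archs, "-> usable:", has_kernel_for(cap, archs))
    try:
        from mmcv.ops import get_compiling_cuda_version, get_compiler_version
        print("mmcv ops compiled with CUDA", get_compiling_cuda_version())
        print("mmcv ops compiler:", get_compiler_version())
    except ImportError as exc:
        print("could not import mmcv ops:", exc)


if __name__ == "__main__":
    report()
```

If torch reports usable kernels for the GPU but the mmcv op still fails, the mmcv wheel is a plausible suspect, since its CUDA ops are compiled separately from PyTorch.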

Sangh0 commented 6 months ago

Did you solve this problem? I am facing the same problem.

rajnishrajput12 commented 6 months ago

What worked for me: I took the mmcv wheel for Python 3.8 from https://download.openmmlab.com/mmcv/dist/cu117/torch1.13.0/mmcv_full-1.7.2-cp38-cp38-manylinux1_x86_64.whl and kept mmdet 2.28.1, CUDA 12.1, and the NVIDIA A100.
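Editorial note: after changing wheels, a quick way to confirm the fix without running the full detector might be to call the op that failed in the traceback, `mmcv.ops.nms`, on a few hand-made boxes. This is only a sketch: the function name `smoke_test_nms` and the sample boxes are made up for illustration, and it reports "skipped" when torch, mmcv, or a CUDA device is missing.

```python
def smoke_test_nms():
    """Run mmcv's NMS op once on the GPU and return a short status string."""
    try:
        import torch
        from mmcv.ops import nms
    except (ImportError, OSError) as exc:
        return f"skipped: {exc}"
    if not torch.cuda.is_available():
        return "skipped: CUDA is not available"

    # Three illustrative boxes in (x1, y1, x2, y2) format; the first two overlap.
    boxes = torch.tensor(
        [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]],
        dtype=torch.float32,
        device="cuda",
    )
    scores = torch.tensor([0.9, 0.8, 0.7], device="cuda")
    try:
        dets, keep = nms(boxes, scores, iou_threshold=0.5)
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
    except RuntimeError as exc:
        return f"failed: {exc}"
    return f"ok: kept indices {keep.tolist()}"


if __name__ == "__main__":
    print(smoke_test_nms())
```

A "failed" result carrying the same "no kernel image" message would suggest the installed mmcv build still lacks kernels for the GPU in use.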