facebookresearch / Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Apache License 2.0

RuntimeError: CUDA error: no kernel image is available for execution on the device #965

Closed cpoptic closed 4 years ago

cpoptic commented 4 years ago


Expected results

What did you expect to see?

Actual results

What did you observe instead?

Detailed steps to reproduce

E.g.:

trainer.train()

W1126 12:36:11.825089 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.cls_score.weight' has shape (81, 1024) in the checkpoint but (22, 1024) in the model! Skipped.
W1126 12:36:11.826922 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.cls_score.bias' has shape (81,) in the checkpoint but (22,) in the model! Skipped.
W1126 12:36:11.827631 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.bbox_pred.weight' has shape (320, 1024) in the checkpoint but (84, 1024) in the model! Skipped.
W1126 12:36:11.828277 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.bbox_pred.bias' has shape (320,) in the checkpoint but (84,) in the model! Skipped.
W1126 12:36:11.829076 140613154027328 checkpoint.py:214] 'roi_heads.mask_head.predictor.weight' has shape (80, 256, 1, 1) in the checkpoint but (21, 256, 1, 1) in the model! Skipped.
W1126 12:36:11.829612 140613154027328 checkpoint.py:214] 'roi_heads.mask_head.predictor.bias' has shape (80,) in the checkpoint but (21,) in the model! Skipped.
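The shape mismatches in these warnings follow a simple pattern and look unrelated to the CUDA error further down: the checkpoint heads appear sized for 80 classes and the model heads for 21. A minimal sketch of that arithmetic, assuming the usual layout of one extra background row in the classifier, four box coordinates per class, and one mask channel per class (the function name and its defaults are hypothetical, chosen only to reproduce the numbers in the log):

```python
def expected_head_shapes(num_classes, box_dim=4, fc_dim=1024, mask_channels=256):
    """Return the head parameter shapes implied by a given class count.

    Assumed layout (inferred from the warnings, not read from detectron2):
      - cls_score has num_classes + 1 outputs (one extra for background)
      - bbox_pred has box_dim outputs per class
      - the mask predictor has one 1x1 conv channel per class
    """
    return {
        "cls_score.weight": (num_classes + 1, fc_dim),
        "cls_score.bias": (num_classes + 1,),
        "bbox_pred.weight": (num_classes * box_dim, fc_dim),
        "bbox_pred.bias": (num_classes * box_dim,),
        "mask_head.predictor.weight": (num_classes, mask_channels, 1, 1),
        "mask_head.predictor.bias": (num_classes,),
    }


if __name__ == "__main__":
    # 80 classes reproduces the checkpoint shapes, 21 the model shapes.
    for k in (80, 21):
        print(k, expected_head_shapes(k))
```

With 80 classes this gives (81, 1024), (320, 1024) and (80, 256, 1, 1); with 21 classes it gives (22, 1024), (84, 1024) and (21, 256, 1, 1), which matches every pair in the warnings.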

RuntimeError Traceback (most recent call last)

in
      1 trainer = DefaultTrainer(cfg)
      2 trainer.resume_or_load(resume=False)
----> 3 trainer.train()

~/repos/detectron2/detectron2/engine/defaults.py in train(self)
    352 OrderedDict of results, if evaluation is enabled. Otherwise None.
    353 """
--> 354 super().train(self.start_iter, self.max_iter)
    355 if hasattr(self, "_last_eval_results") and comm.is_main_process():
    356 verify_results(self.cfg, self._last_eval_results)

~/repos/detectron2/detectron2/engine/train_loop.py in train(self, start_iter, max_iter)
    130 for self.iter in range(start_iter, max_iter):
    131 self.before_step()
--> 132 self.run_step()
    133 self.after_step()
    134 finally:

~/repos/detectron2/detectron2/engine/train_loop.py in run_step(self)
    210 If your want to do something with the losses, you can wrap the model.
    211 """
--> 212 loss_dict = self.model(data)
    213 losses = sum(loss for loss in loss_dict.values())
    214 self._detect_anomaly(losses, loss_dict)

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539 result = self._slow_forward(*input, **kwargs)
    540 else:
--> 541 result = self.forward(*input, **kwargs)
    542 for hook in self._forward_hooks.values():
    543 hook_result = hook(self, input, result)

~/repos/detectron2/detectron2/modeling/meta_arch/rcnn.py in forward(self, batched_inputs)
     80
     81 if self.proposal_generator:
---> 82 proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)
     83 else:
     84 assert "proposals" in batched_inputs[0]

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539 result = self._slow_forward(*input, **kwargs)
    540 else:
--> 541 result = self.forward(*input, **kwargs)
    542 for hook in self._forward_hooks.values():
    543 hook_result = hook(self, input, result)

~/repos/detectron2/detectron2/modeling/proposal_generator/rpn.py in forward(***failed resolving arguments***)
    177 self.post_nms_topk[self.training],
    178 self.min_box_side_len,
--> 179 self.training,
    180 )
    181 # For RPN-only models, the proposals are the final output and we return them in

~/repos/detectron2/detectron2/modeling/proposal_generator/rpn_outputs.py in find_top_rpn_proposals(proposals, pred_objectness_logits, images, nms_thresh, pre_nms_topk, post_nms_topk, min_box_side_len, training)
    134 boxes, scores_per_img, lvl = boxes[keep], scores_per_img[keep], level_ids[keep]
    135
--> 136 keep = batched_nms(boxes.tensor, scores_per_img, lvl, nms_thresh)
    137 # In Detectron1, there was different behavior during training vs. testing.
    138 # (https://github.com/facebookresearch/Detectron/issues/459)

~/repos/detectron2/detectron2/layers/nms.py in batched_nms(boxes, scores, idxs, iou_threshold)
     15 # Investigate after having a fully-cuda NMS op.
     16 if len(boxes) < 40000:
---> 17 return box_ops.batched_nms(boxes, scores, idxs, iou_threshold)
     18
     19 result_mask = scores.new_zeros(scores.size(), dtype=torch.bool)

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/ops/boxes.py in batched_nms(boxes, scores, idxs, iou_threshold)
     70 offsets = idxs.to(boxes) * (max_coordinate + 1)
     71 boxes_for_nms = boxes + offsets[:, None]
---> 72 keep = nms(boxes_for_nms, scores, iou_threshold)
     73 return keep
     74

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/ops/boxes.py in nms(boxes, scores, iou_threshold)
     31 """
     32 _C = _lazy_import()
---> 33 return _C.nms(boxes, scores, iou_threshold)
     34
     35

RuntimeError: CUDA error: no kernel image is available for execution on the device (nms_cuda at /tmp/pip-req-build-ekueqync/torchvision/csrc/cuda/nms_cuda.cu:127)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string, std::allocator > const&) + 0x6d (0x7fe26495ce7d in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: nms_cuda(at::Tensor const&, at::Tensor const&, float) + 0x8d1 (0x7fe230278ece in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #2: nms(at::Tensor const&, at::Tensor const&, float) + 0x183 (0x7fe23023ced7 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #3: + 0x79cf5 (0x7fe230256cf5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #4: + 0x765b0 (0x7fe2302535b0 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #5: + 0x70d1e (0x7fe23024dd1e in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #6: + 0x70fc2 (0x7fe23024dfc2 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #7: + 0x5be4a (0x7fe230238e4a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #8: _PyMethodDef_RawFastCallKeywords + 0x264 (0x5647d51a1c34 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #9: _PyCFunction_FastCallKeywords + 0x21 (0x5647d51a1d51 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #10: _PyEval_EvalFrameDefault + 0x4ebc (0x5647d520e0ac in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #11: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #12: _PyEval_EvalFrameDefault + 0x416 (0x5647d5209606 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #13: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #14: _PyEval_EvalFrameDefault + 0x4b29 (0x5647d520dd19 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #15: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #16: _PyEval_EvalFrameDefault + 0x416 (0x5647d5209606 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #17: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #18: _PyEval_EvalFrameDefault + 0x416 (0x5647d5209606 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #19: _PyEval_EvalCodeWithName + 0xab8 (0x5647d5151978 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #20: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #21: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #22: PyObject_Call + 0x6e (0x5647d5163a3e in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x1f3a (0x5647d520b12a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #24: _PyEval_EvalCodeWithName + 0x2f9 (0x5647d51511b9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #25: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #26: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #27: + 0x16a2da (0x5647d51a82da in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #28: _PyObject_FastCallKeywords + 0x49b (0x5647d51a919b in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #29: _PyEval_EvalFrameDefault + 0x52e6 (0x5647d520e4d6 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #30: _PyEval_EvalCodeWithName + 0xab8 (0x5647d5151978 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #31: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #32: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #33: PyObject_Call + 0x6e (0x5647d5163a3e in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #34: _PyEval_EvalFrameDefault + 0x1f3a (0x5647d520b12a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #35: _PyEval_EvalCodeWithName + 0x2f9 (0x5647d51511b9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #36: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #37: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #38: + 0x16a2da (0x5647d51a82da in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #39: _PyObject_FastCallKeywords + 0x49b (0x5647d51a919b in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #40: _PyEval_EvalFrameDefault + 0x52e6 (0x5647d520e4d6 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #41: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #42: _PyEval_EvalFrameDefault + 0x6a3 (0x5647d5209893 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #43: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x4b29 (0x5647d520dd19 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #45: _PyEval_EvalCodeWithName + 0x5da (0x5647d515149a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #46: _PyFunction_FastCallKeywords + 0x387 (0x5647d51a1437 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #47: _PyEval_EvalFrameDefault + 0x6a3 (0x5647d5209893 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #48: _PyEval_EvalCodeWithName + 0x2f9 (0x5647d51511b9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #49: PyEval_EvalCodeEx + 0x44 (0x5647d5152094 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #50: PyEval_EvalCode + 0x1c (0x5647d51520bc in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #51: + 0x1daeb0 (0x5647d5218eb0 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #52: _PyMethodDef_RawFastCallKeywords + 0xe9 (0x5647d51a1ab9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #53: _PyCFunction_FastCallKeywords + 0x21 (0x5647d51a1d51 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #54: _PyEval_EvalFrameDefault + 0x4784 (0x5647d520d974 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #55: _PyGen_Send + 0x2a2 (0x5647d51a9e32 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #56: _PyEval_EvalFrameDefault + 0x1a88 (0x5647d520ac78 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #57: _PyGen_Send + 0x2a2 (0x5647d51a9e32 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #58: _PyEval_EvalFrameDefault + 0x1a88 (0x5647d520ac78 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #59: _PyGen_Send + 0x2a2 (0x5647d51a9e32 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #60: _PyMethodDef_RawFastCallKeywords + 0x8d (0x5647d51a1a5d in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #61: _PyMethodDescr_FastCallKeywords + 0x4f (0x5647d51a8c6f in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #62: _PyEval_EvalFrameDefault + 0x4c7b (0x5647d520de6b in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #63: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)

System information

Running in a Conda environment with Detectron2 installed. I downgraded from CUDA 10.1 to CUDA 9.2 to fix an earlier bug involving:

no kernel image is available for execution on the device

!nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50 Driver Version: 430.50 CUDA Version: 10.1

gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ppwwyyxx commented 4 years ago

Detectron and detectron2 are two different projects. Your error is described in https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues. If this does not solve the problem, please include details about the problem following detectron2's issue template.
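For readers hitting the same message: "no kernel image is available" generally means the compiled CUDA extension (here the traceback points at torchvision's `nms_cuda`) contains no code built for the GPU's compute capability. The sketch below shows that check in plain Python. It assumes the commonly documented CUDA rule that an `sm_XY` binary runs on devices with the same major version and an equal or higher minor version, while `compute_XY` PTX can be JIT-compiled for any equal or newer device. The helper names are hypothetical and are not part of PyTorch or detectron2.

```python
def parse_arch(tag):
    """Split a tag such as 'sm_60' or 'compute_70' into (kind, (major, minor)).

    Only plain numeric tags are handled; suffixed variants raise ValueError.
    """
    kind, _, digits = tag.partition("_")
    if kind not in ("sm", "compute") or not digits.isdigit() or len(digits) < 2:
        raise ValueError("unrecognised architecture tag: %r" % (tag,))
    return kind, (int(digits[:-1]), int(digits[-1]))


def has_kernel_image(device_cc, arch_list):
    """Return True if some entry in arch_list can run on a device whose
    compute capability is device_cc, given as a (major, minor) tuple."""
    for tag in arch_list:
        kind, cc = parse_arch(tag)
        if kind == "sm":
            # Binary code: same major version, minor not newer than the device.
            if cc[0] == device_cc[0] and cc[1] <= device_cc[1]:
                return True
        elif cc <= device_cc:
            # PTX: usable on any device of equal or higher capability.
            return True
    return False


if __name__ == "__main__":
    # Illustrative values only, not taken from the report above.
    print(has_kernel_image((6, 1), ["sm_60", "sm_70"]))
    print(has_kernel_image((7, 5), ["sm_35", "sm_60"]))
```

On a real machine the device side can be read with `torch.cuda.get_device_capability()`, and recent PyTorch releases expose `torch.cuda.get_arch_list()` for PyTorch's own binaries. That list does not describe a separately compiled torchvision or detectron2 extension, so when the check fails for those, rebuilding them with `TORCH_CUDA_ARCH_LIST` set to include the GPU's architecture is the usual remedy.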