megvii-research / Iter-E2EDET

Official implementation of the paper "Progressive End-to-End Object Detection in Crowded Scenes"
MIT License
88 stars, 8 forks

Training COCO dataset - RuntimeError: CUDA error: unknown error #26

Closed by vicdxxx 1 year ago

vicdxxx commented 1 year ago

Traceback (most recent call last):
  File "train_net.py", line 134, in <module>
    launch(
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/detectron2/engine/launch.py", line 63, in launch
    main_func(*args)
  File "train_net.py", line 128, in main
    return trainer.train()
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/detectron2/engine/defaults.py", line 431, in train
    super().train(self.start_iter, self.max_iter)
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/detectron2/engine/train_loop.py", line 134, in train
    self.run_step()
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/detectron2/engine/defaults.py", line 441, in run_step
    self._trainer.run_step()
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/detectron2/engine/train_loop.py", line 228, in run_step
    loss_dict = self.model(data)
  File "/mnt/f/Software/AnacondaWSL2/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/projects/crowd-e2e-sparse-rcnn/models/detector.py", line 147, in forward
    outputs_class, outputs_coord, ctns = self.head(features, proposal_boxes, self.init_proposal_features.weight, \
  File "/mnt/f/Software/AnacondaWSL2/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/projects/crowd-e2e-sparse-rcnn/models/head.py", line 116, in forward
    tmp_container = rcnn_head(features, container)
  File "/mnt/f/Software/AnacondaWSL2/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/e/PHD/BlueberryDenseDetection/Iter-E2EDET/projects/crowd-e2e-sparse-rcnn/models/rcnn_head.py", line 120, in forward
    pro_features2 = self.self_attn(pro_features, pro_features, value=pro_features)[0]
  File "/mnt/f/Software/AnacondaWSL2/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/f/Software/AnacondaWSL2/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/activation.py", line 1167, in forward
    attn_output, attn_output_weights = F.multi_head_attention_forward(
  File "/mnt/f/Software/AnacondaWSL2/envs/pytorch/lib/python3.8/site-packages/torch/nn/functional.py", line 5161, in multi_head_attention_forward
    attn_output_weights = softmax(attn_output_weights, dim=-1)
  File "/mnt/f/Software/AnacondaWSL2/envs/pytorch/lib/python3.8/site-packages/torch/nn/functional.py", line 1841, in softmax
    ret = input.softmax(dim)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Any suggestions? Thanks for your help.
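For anyone hitting the same message: the hint at the end of the error (CUDA_LAUNCH_BLOCKING=1) can be followed from Python as well as from the shell. The snippet below is only a minimal sketch. The helper name `enable_blocking_cuda_launches` is made up for illustration, and it assumes the variable has to be set before CUDA is initialized, which in practice is easiest to guarantee by setting it before `torch` is imported (for example at the very top of the training script).

```python
import os
import sys


def enable_blocking_cuda_launches() -> bool:
    """Set CUDA_LAUNCH_BLOCKING=1 for the current process.

    Hypothetical helper, not part of Iter-E2EDET or detectron2.
    Returns True if torch had not been imported yet, i.e. the setting
    was applied early enough to be safely picked up. This check is a
    conservative heuristic, not an official API.
    """
    torch_already_imported = "torch" in sys.modules
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
    return not torch_already_imported


# Call this before any `import torch` line in the entry script.
applied_early = enable_blocking_cuda_launches()
if not applied_early:
    print("Warning: torch was already imported; restart and set the variable earlier.")
```

With blocking launches enabled, kernels run synchronously, so the Python frame in the traceback is more likely to be the one that actually failed. Training will be slower, so this is for debugging only.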

vicdxxx commented 1 year ago

This turned out to be a NUM_PROPOSALS issue: I had increased this value for dense detection.
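The thread does not say why a larger NUM_PROPOSALS triggered the error. One plausible explanation, offered here only as a guess, is memory pressure: the traceback ends in the softmax of a self-attention layer over the proposal features, and the attention-weight tensor there grows with the square of the number of proposals. The rough estimate below is a sketch under stated assumptions: the batch size, head count and float32 element size are illustrative values, not taken from the repository's configs, and the function name is hypothetical.

```python
def attn_weight_bytes(
    num_proposals: int,
    batch_size: int = 2,      # assumed images per GPU, illustrative only
    num_heads: int = 8,       # assumed attention heads, illustrative only
    bytes_per_elem: int = 4,  # float32
) -> int:
    """Approximate size of one (batch * heads, N, N) attention-weight tensor."""
    return batch_size * num_heads * num_proposals * num_proposals * bytes_per_elem


# Compare a few proposal counts; doubling N quadruples the tensor size.
for n in (500, 1000, 2000, 4000):
    mib = attn_weight_bytes(n) / 2**20
    print(f"NUM_PROPOSALS={n:5d}: ~{mib:,.0f} MiB per attention-weight tensor")
```

This counts a single tensor in a single layer. Each refinement stage has its own attention, and autograd keeps intermediates for the backward pass, so the real footprint is several times larger. If the estimate approaches the GPU's memory, lowering NUM_PROPOSALS or the batch size is a reasonable first thing to try.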