PyTorch implementation of Cascade R-CNN, 736px (max side), 41.2 mAP (COCO), 21.94 FPS (RTX 2080 Ti)
/opt/conda/conda-bld/pytorch_1634272126608/work/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed. #3
Open
fancy-chenyao opened 1 year ago
Hello — I ran into this error while debugging your code. I'm training on my own dataset; where might the problem be? The full log is below. Looking forward to your reply!

/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1634272126608/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/opt/conda/conda-bld/pytorch_1634272126608/work/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1634272126608/work/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1634272126608/work/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [5,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1634272126608/work/aten/src/ATen/native/cuda/Loss.cu:247: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
  0%|          | 0/1125 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 7, in <module>
    processor.run()
  File "/root/cascade-rcnn/solver/ddp_mix_solver.py", line 213, in run
    self.train(epoch)
  File "/root/cascade-rcnn/solver/ddp_mix_solver.py", line 113, in train
    targets={"target": targets_tensor, "batch_len": batch_len})
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/cascade-rcnn/nets/cascade_rcnn.py", line 702, in forward
    box_predicts, cls_predicts, roi_losses = self.cascade_head(feature_dict, boxes, valid_size, targets)
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/cascade-rcnn/nets/cascade_rcnn.py", line 621, in forward
    boxes, cls, loss = self.roi_heads[i](feature_dict, boxes, valid_size, targets)
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/cascade-rcnn/nets/cascade_rcnn.py", line 590, in forward
    cls_loss, box_loss = self.compute_loss(proposals, cls_predicts, box_predicts, targets)
  File "/root/cascade-rcnn/nets/cascade_rcnn.py", line 564, in compute_loss
    cls_loss = self.ce(loss_cls_predicts, loss_cls_targets)
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1152, in forward
    label_smoothing=self.label_smoothing)
  File "/root/miniconda3/envs/my-env/lib/python3.7/site-packages/torch/nn/functional.py", line 2846, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: CUDA error: device-side assert triggered
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 0 (pid: 2085) of binary: /root/miniconda3/envs/my-env/bin/python
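The assertion `t >= 0 && t < n_classes` means some class label passed to cross-entropy falls outside `[0, n_classes)` — a common mismatch when training on a custom dataset whose category ids are 1-based or exceed the configured number of classes. A minimal sketch of a pre-training sanity check (the `check_labels` helper is hypothetical, not part of this repo):

```python
def check_labels(labels, n_classes):
    """Return the label values that would trip the CUDA assertion
    `t >= 0 && t < n_classes` inside nll_loss / cross_entropy."""
    return [t for t in labels if not (0 <= t < n_classes)]

# Example: a 5-class head expects ids 0..4; a dataset that uses
# 1-based ids up to 5 would assert on the GPU for id 5.
bad = check_labels([0, 3, 5, 2], n_classes=5)
print(bad)  # -> [5]
```

On GPU the failure surfaces only as the opaque `device-side assert triggered`; re-running with `CUDA_LAUNCH_BLOCKING=1`, or computing the loss on CPU once, makes PyTorch report the offending index directly.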