GOATmessi8 / ASFF

yolov3 with mobilenet v2 and ASFF
GNU General Public License v3.0

During training, IndexError occurred. #82

Open Jucjiaswiss opened 4 years ago

Jucjiaswiss commented 4 years ago

Not long after training started, the following error occurred:

[Epoch 0/500][Iter 2350/5649][lr 0.000000][Loss: anchor 12.62, iou 12.75, l1 60.10, conf 1123.60, cls 258.15, imgsize 320, time: 6.73]
[Epoch 0/500][Iter 2360/5649][lr 0.000000][Loss: anchor 15.11, iou 15.44, l1 77.95, conf 3106.67, cls 311.54, imgsize 576, time: 7.44]
Traceback (most recent call last):
  File "main.py", line 486, in <module>
    main()
  File "main.py", line 416, in main
    loss_dict = model(imgs, targets, epoch)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_asff.py", line 149, in forward
    x, anchor_loss, iou_loss, l1_loss, conf_loss, cls_loss = header(fused, targets)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_head.py", line 214, in forward
    pred_anchors[b, self.n_anchors-1, j, i, :4].data.cpu().view(-1,4),xyxy=False) #iou of pred anchor
IndexError: index 19 is out of bounds for dimension 3 with size 18
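For readers trying to interpret the message: the thread does not confirm a cause, but the numbers are consistent with a ground-truth box whose centre lies past the right edge of the image. The sketch below assumes a generic YOLO-style head with stride 32, where a 576-pixel input gives an 18-cell grid. The helpers `grid_size` and `grid_index` are hypothetical and are not taken from `yolov3_head.py`.

```python
# Illustrative only: a generic YOLO-style mapping from pixel coordinates to
# grid cells. The stride of 32 and both helper names are assumptions.

def grid_size(img_size, stride=32):
    """Number of grid cells along one axis for a square input."""
    return img_size // stride


def grid_index(center_px, stride=32):
    """Grid-cell index that a box centre (in pixels) falls into."""
    return int(center_px / stride)


size = grid_size(576)          # 18 cells, valid indices are 0..17
inside = grid_index(570.0)     # 17, a valid index
outside = grid_index(610.0)    # 19, past the last valid index

print(size, inside, outside)
print("valid:", 0 <= inside < size, "| invalid:", not (0 <= outside < size))
```

If this assumption holds, any annotation whose centre x-coordinate is at or beyond the image width would produce an index the grid cannot hold, which matches "index 19 ... with size 18".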

Willert98 commented 4 years ago

Same problem here. Is there a solution?

Willert98 commented 4 years ago

@Jucjiaswiss Have you found a solution? Help...

Willert98 commented 4 years ago

Could the labels be wrong?
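One way to test the label hypothesis is to scan the annotations for boxes that fall outside their image or have no area. This is a minimal sketch, assuming each box is a pixel-space `(xmin, ymin, xmax, ymax)` tuple; `find_bad_boxes` is a hypothetical helper, not part of the ASFF repository.

```python
# Hypothetical label sanity check. Assumes pixel-space corner boxes
# (xmin, ymin, xmax, ymax) and a known image width and height.

def find_bad_boxes(boxes, width, height):
    """Return indices of boxes that leave the image or are degenerate."""
    bad = []
    for idx, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        x_ok = 0 <= xmin < xmax <= width
        y_ok = 0 <= ymin < ymax <= height
        if not (x_ok and y_ok):
            bad.append(idx)
    return bad


boxes = [
    (10, 10, 50, 50),     # fine
    (300, 20, 340, 60),   # xmax exceeds a 320-pixel-wide image
    (5, 5, 5, 20),        # zero width
]
print(find_bad_boxes(boxes, width=320, height=240))  # [1, 2]
```

Running a check like this over every annotation before training would show whether any box can map to a grid cell that does not exist.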

Jucjiaswiss commented 4 years ago

I still don't know why, sorry.

Jucjiaswiss commented 4 years ago

I switched to a VOC-style dataset and training ran without problems. My former data was in VOC-converted-to-COCO style. Hope it works for you.
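The thread does not show the conversion script, so the following is only an illustration of one way a VOC-to-COCO conversion can go wrong. VOC annotations store corners (`xmin, ymin, xmax, ymax`), while COCO stores `[x, y, width, height]` with `x, y` at the top-left corner. The function names below are hypothetical.

```python
# Illustrative conversion between the two bounding-box conventions.
# VOC: (xmin, ymin, xmax, ymax). COCO: [x, y, width, height].

def voc_to_coco(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to top-left plus size."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def coco_to_voc(x, y, w, h):
    """Convert top-left plus size back to corner coordinates."""
    return [x, y, x + w, y + h]


voc_box = (200, 50, 300, 150)        # valid in a 320-pixel-wide image
coco_box = voc_to_coco(*voc_box)     # [200, 50, 100, 100]

# If the corners are written out unchanged but later read as COCO
# [x, y, w, h], the box appears to extend to x = 200 + 300 = 500.
misread = coco_to_voc(*voc_box)      # [200, 50, 500, 200]

print(coco_box, misread)
```

A box misread this way extends well past the image, which is the kind of annotation that could trigger an out-of-range grid index.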

Willert98 commented 4 years ago

> I switched to a VOC-style dataset and training ran without problems. My former data was in VOC-converted-to-COCO style. Hope it works for you.

Yes, I guess it's the same reason, but I also can't find what is wrong. Thank you!

Jucjiaswiss commented 4 years ago

[Epoch 0/500][Iter 19030/83822][lr 0.000000][Loss: anchor 19.62, iou 20.14, l1 83.97, conf 150.48, cls 2597.47, imgsize 448, time: 9.23]
[Epoch 0/500][Iter 19040/83822][lr 0.000000][Loss: anchor 29.36, iou 30.13, l1 134.33, conf 136.58, cls 4027.54, imgsize 320, time: 7.31]
Traceback (most recent call last):
  File "main.py", line 472, in <module>
    main()
  File "main.py", line 398, in main
    loss_dict = model(imgs, targets, epoch)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/_utils.py", line 385, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in replica 1 on device 1.
Original Traceback (most recent call last):
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_asff.py", line 149, in forward
    x, anchor_loss, iou_loss, l1_loss, conf_loss, cls_loss = header(fused, targets)
  File "/home/anaconda3/envs/ASFF/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/detection_networks/ASFF/models/yolov3_head.py", line 214, in forward
    pred_anchors[b, self.n_anchors-1, j, i, :4].data.cpu().view(-1,4),xyxy=False) #iou of pred anchor
IndexError: index 19 is out of bounds for dimension 3 with size 19

With a different VOC dataset, this error still occurred after several iterations.

Jucjiaswiss commented 4 years ago

Plus, these are custom datasets with 300 classes.