longcw / faster_rcnn_pytorch

Faster RCNN with PyTorch
MIT License

ValueError: attempt to get argmax of an empty sequence #37

Open Cadene opened 6 years ago

Cadene commented 6 years ago

I am trying to train a model on my custom dataset (formatted like Pascal VOC). The model trains for several iterations and then this error occurs:

im_size: (97.0, 1000.0)
scale: 1.5082956552505493
height, width: (6, 62)
rpn: gt_boxes.shape (13, 5)
rpn: gt_boxes [[    0.             0.            85.97284698    63.34841537    25.        ]
 [   70.88989258     0.           144.79638672    63.34841537    33.        ]
 [  131.22172546     1.50829566   209.65309143    66.36500549     5.        ]
 [  193.06184387     0.           265.46002197    66.36500549    12.        ]
 [  256.4102478      0.           334.84161377    61.84012222     6.        ]
 [  324.28356934     4.52488708   405.73153687    66.36500549    30.        ]
 [  392.15686035     3.01659131   488.68777466    64.85671234     2.        ]
 [  461.53845215     6.03318262   549.01959229    75.41477966    22.        ]
 [  618.40118408     6.03318262   713.42382812    75.41477966     2.        ]
 [  698.34088135     6.03318262   787.33032227    70.88989258    30.        ]
 [  773.75567627     7.54147816   867.2699585     73.90648651     2.        ]
 [  825.03771973     9.04977417   975.86724854    96.53092194     7.        ]
 [  920.06030273     4.52488708  1000.            78.4313736     12.        ]]
total_anchors 3348
inds_inside 0
anchors.shape (0, 4)
[]
[[    0.             0.            85.97284698    63.34841537    25.        ]
 [   70.88989258     0.           144.79638672    63.34841537    33.        ]
 [  131.22172546     1.50829566   209.65309143    66.36500549     5.        ]
 [  193.06184387     0.           265.46002197    66.36500549    12.        ]
 [  256.4102478      0.           334.84161377    61.84012222     6.        ]
 [  324.28356934     4.52488708   405.73153687    66.36500549    30.        ]
 [  392.15686035     3.01659131   488.68777466    64.85671234     2.        ]
 [  461.53845215     6.03318262   549.01959229    75.41477966    22.        ]
 [  618.40118408     6.03318262   713.42382812    75.41477966     2.        ]
 [  698.34088135     6.03318262   787.33032227    70.88989258    30.        ]
 [  773.75567627     7.54147816   867.2699585     73.90648651     2.        ]
 [  825.03771973     9.04977417   975.86724854    96.53092194     7.        ]
 [  920.06030273     4.52488708  1000.            78.4313736     12.        ]]
Traceback (most recent call last):
  File "train.py", line 129, in <module>
    net(im_data, im_info, gt_boxes, gt_ishard, dontcare_areas)
  File "/home/cadene/anaconda3/envs/faster_rcnn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/cadene/Documents/faster_rcnn_pytorch_python3/faster_rcnn/faster_rcnn.py", line 215, in forward
    features, rois = self.rpn(im_data, im_info, gt_boxes, gt_ishard, dontcare_areas)
  File "/home/cadene/anaconda3/envs/faster_rcnn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/cadene/Documents/faster_rcnn_pytorch_python3/faster_rcnn/faster_rcnn.py", line 77, in forward
    im_info, self._feat_stride, self.anchor_scales)
  File "/home/cadene/Documents/faster_rcnn_pytorch_python3/faster_rcnn/faster_rcnn.py", line 148, in anchor_target_layer
    anchor_target_layer_py(rpn_cls_score, gt_boxes, gt_ishard, dontcare_areas, im_info, _feat_stride, anchor_scales)
  File "/home/cadene/Documents/faster_rcnn_pytorch_python3/faster_rcnn/rpn_msr/anchor_target_layer.py", line 150, in anchor_target_layer
    gt_argmax_overlaps = overlaps.argmax(axis=0)  # G
ValueError: attempt to get argmax of an empty sequence

It appears to happen because none of the anchors in all_anchors lies "inside the image": inds_inside is 0, so overlaps is empty when argmax is called. See https://github.com/longcw/faster_rcnn_pytorch/blob/master/faster_rcnn/rpn_msr/anchor_target_layer.py#L118 I can't figure out how to fix this...

all_anchors [[  -84.   -40.    99.    55.]
 [ -176.   -88.   191.   103.]
 [ -360.  -184.   375.   199.]
 ..., 
 [  940.     0.  1027.   175.]
 [  896.   -88.  1071.   263.]
 [  808.  -264.  1159.   439.]]
total_anchors 3348
inds_inside 0
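
For what it's worth, the log above can be checked by hand. The sketch below looks only at the vertical part of the inside-image test for the three shortest base anchors. It assumes the standard py-faster-rcnn base anchors (the first one matches the first row of the all_anchors dump) and a feature stride of 16; the function name `y_shifts_inside` is made up for illustration.

```python
# Vertical extents (y1, y2) of the three shortest base anchors.
# Assumption: standard py-faster-rcnn anchors (base size 16, scale 8,
# ratios 0.5 / 1 / 2). The first one matches the dump: [-84, -40, 99, 55].
BASE_Y_EXTENTS = [(-40, 55), (-56, 71), (-80, 95)]


def y_shifts_inside(im_height, feat_height, feat_stride=16, allowed_border=0):
    """For each base anchor, list the vertical shifts at which it passes
    the vertical half of the inside-image test."""
    shifts = range(0, feat_height * feat_stride, feat_stride)
    return {
        (y1, y2): [
            s for s in shifts
            if y1 + s >= -allowed_border and y2 + s < im_height + allowed_border
        ]
        for (y1, y2) in BASE_Y_EXTENTS
    }


# Values from the log: image height 97, feature map height 6.
fits = y_shifts_inside(im_height=97, feat_height=6)
for extent, shifts in fits.items():
    print(extent, "->", shifts)
```

With a 97-pixel-tall image, the shortest anchor would only fit at a vertical shift of 40 or 41, and the shifts are multiples of 16, so every list comes back empty. Taller anchors cannot fit at all, which is consistent with inds_inside 0.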

Cadene commented 6 years ago

I may have found a fix: set _allowed_border = 50 here: https://github.com/longcw/faster_rcnn_pytorch/blob/master/faster_rcnn/rpn_msr/anchor_target_layer.py#L69

I hope it will converge nicely.
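
To see what the border change does for this particular image, here is a rough NumPy sketch of the inside-image filter. It is not the repository's code: the nine base anchors are assumed to be the py-faster-rcnn defaults (the first three match the all_anchors dump above), the feature stride is assumed to be 16, and `count_inside` is a hypothetical helper.

```python
import numpy as np

# Assumption: the nine default py-faster-rcnn base anchors
# (base size 16, ratios 0.5 / 1 / 2, scales 8 / 16 / 32).
BASE_ANCHORS = np.array([
    [ -84,  -40,  99,  55],
    [-176,  -88, 191, 103],
    [-360, -184, 375, 199],
    [ -56,  -56,  71,  71],
    [-120, -120, 135, 135],
    [-248, -248, 263, 263],
    [ -36,  -80,  51,  95],
    [ -80, -168,  95, 183],
    [-168, -344, 183, 359],
], dtype=np.float32)


def count_inside(im_h, im_w, feat_h, feat_w, allowed_border, feat_stride=16):
    """Return (total_anchors, anchors_kept) for the inside-image test."""
    # One (x, y, x, y) shift per feature-map cell.
    shift_x, shift_y = np.meshgrid(np.arange(feat_w) * feat_stride,
                                   np.arange(feat_h) * feat_stride)
    shifts = np.stack([shift_x.ravel(), shift_y.ravel(),
                       shift_x.ravel(), shift_y.ravel()], axis=1)
    # Every base anchor at every cell: (cells, 9, 4) -> (cells * 9, 4).
    all_anchors = (BASE_ANCHORS[None, :, :] + shifts[:, None, :]).reshape(-1, 4)
    inside = (
        (all_anchors[:, 0] >= -allowed_border)
        & (all_anchors[:, 1] >= -allowed_border)
        & (all_anchors[:, 2] < im_w + allowed_border)
        & (all_anchors[:, 3] < im_h + allowed_border)
    )
    return len(all_anchors), int(inside.sum())


# Values from the log: im_size (97, 1000), feature map (6, 62).
for border in (0, 50):
    total, kept = count_inside(97, 1000, 6, 62, allowed_border=border)
    print(f"allowed_border={border}: kept {kept} of {total} anchors")
```

Under these assumptions a border of 0 keeps no anchors, matching inds_inside 0, while a border of 50 keeps some of the smallest ones. Whether a given border is enough depends on the image size, so it may not help for every dataset.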

SilencerChen commented 6 years ago

Have you fixed this problem? I have the same issue, and changing _allowed_border from 0 to 50 does not solve it, though I am using the TensorFlow code. It would be great if you could tell me how to fix it. Thanks so much.

Jongchan commented 6 years ago

In my case, this error occurs when a sample has no gt box (I am using a custom dataset). Removing such samples from the dataset fixed the problem.
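
A minimal sketch of that kind of filtering, assuming the dataset is a list of dicts with a "boxes" entry per image. The structure, the key names, and the `drop_empty_samples` helper are hypothetical; adapt them to however your loader stores annotations.

```python
def drop_empty_samples(roidb):
    """Keep only samples that have at least one ground-truth box."""
    kept = [entry for entry in roidb if len(entry["boxes"]) > 0]
    print(f"kept {len(kept)} of {len(roidb)} samples")
    return kept


# Hypothetical entries: image path plus gt boxes as
# [x1, y1, x2, y2, class_id] rows.
roidb = [
    {"image": "a.jpg", "boxes": [[0.0, 0.0, 85.9, 63.3, 25]]},
    {"image": "b.jpg", "boxes": []},  # no annotations: would be dropped
]
roidb = drop_empty_samples(roidb)
```

Running the filter once before training, rather than skipping samples inside the training loop, also makes it easy to see how many images were affected.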

zqdeepbluesky commented 6 years ago

@SilencerChen @Jongchan Hi, I ran into the same problem. Did you fix it? Can you tell me how to solve it? Thanks so much!

qqzeng commented 6 years ago

Hi everybody, I ran into the same error. In my case it was caused by my DataFrame containing np.NAN. Hope this helps! :-) PS: I'm not using faster_rcnn_pytorch; I was just working with a DataFrame.