ijkguo / mx-rcnn

Parallel Faster R-CNN implementation with MXNet.

Custom Training: RuntimeWarning: invalid value encountered in greater_equal #29

Closed longwoo closed 7 years ago

longwoo commented 7 years ago

I am trying to modify some training parameters. In some cases, NaN values appear in ws and hs in /rcnn/rpn/proposal.py and cause the following warnings:

/home/will/mx-rcnn/rcnn/rpn/proposal.py:167: RuntimeWarning: invalid value encountered in greater_equal
  keep = np.where((ws >= min_size) & (hs >= min_size))[0]
/home/will/mx-rcnn/helper/processing/bbox_transform.py:65: RuntimeWarning: invalid value encountered in subtract
  pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * (pred_w - 1.0)
/home/will/mx-rcnn/helper/processing/bbox_transform.py:67: RuntimeWarning: invalid value encountered in subtract
  pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * (pred_h - 1.0)
/home/will/mx-rcnn/helper/processing/bbox_transform.py:69: RuntimeWarning: invalid value encountered in add
  pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * (pred_w - 1.0)
/home/will/mx-rcnn/helper/processing/bbox_transform.py:71: RuntimeWarning: invalid value encountered in add
  pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * (pred_h - 1.0)

To make it robust, I am trying to remove the NaN values. Do you have any ideas? @precedenceguo

    @staticmethod
    def _filter_boxes(boxes, min_size):
        """ Remove all boxes with any side smaller than min_size """
        ws = boxes[:, 2] - boxes[:, 0] + 1
        hs = boxes[:, 3] - boxes[:, 1] + 1
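        # Print full arrays while debugging; recent NumPy rejects threshold='nan'.
        np.set_printoptions(threshold=sys.maxsize)  # needs `import sys` at module top
        # if np.isnan(ws).any():
            # print("np.isnan(ws) = ")
            # print(np.isnan(ws))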
        # print("ws = ")
        # print(ws)
        # print("hs = ")
        # print(hs)
        # print("min_size = ")
        # print(min_size)
        keep = np.where((ws >= min_size) & (hs >= min_size))[0]
        return keep
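
One way to drop the NaN boxes, as a minimal sketch rather than the repository's own fix: treat a box as usable only when all four coordinates are finite, and combine that mask with the size test. The standalone function name `filter_boxes` and the choice to also reject infinite coordinates are assumptions made for illustration.

```python
import numpy as np


def filter_boxes(boxes, min_size):
    """Return indices of boxes that are finite and at least min_size on each side."""
    boxes = np.asarray(boxes, dtype=np.float64)
    # A box is usable only if all four coordinates are finite (no NaN, no inf).
    finite = np.isfinite(boxes[:, :4]).all(axis=1)
    # Arithmetic and comparisons on NaN/inf can emit RuntimeWarning, so silence
    # them locally; the `finite` mask already excludes those rows.
    with np.errstate(invalid="ignore"):
        ws = boxes[:, 2] - boxes[:, 0] + 1
        hs = boxes[:, 3] - boxes[:, 1] + 1
        big_enough = (ws >= min_size) & (hs >= min_size)
    return np.where(finite & big_enough)[0]


if __name__ == "__main__":
    demo = np.array([
        [0.0, 0.0, 15.0, 15.0],     # valid
        [np.nan, 0.0, 10.0, 10.0],  # NaN coordinate
        [0.0, 0.0, 2.0, 2.0],       # too small
    ])
    print(filter_boxes(demo, min_size=4))
```

Filtering like this only hides the symptom. If the proposals contain NaN, the network outputs feeding them are probably already non-finite, so the underlying cause still needs to be tracked down.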

ijkguo commented 7 years ago

Happy hacking!

ijkguo commented 7 years ago

Do we have any updates on this? What parameters could cause this NaN problem?

fernandorovai commented 7 years ago

I am facing the same problem. Any update?

ijkguo commented 7 years ago

Actually, I could not reproduce this NaN problem.

fernandorovai commented 7 years ago

I cannot train any dataset because I have this issue in all of them. Kinda crazy trying to figure it out. Thanks anyway!

ijkguo commented 7 years ago

Is it possible that the bounding box coordinates in the parsed annotations are wrong? Visualize the ground truth to make sure.
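
Besides visualizing, a quick programmatic check can flag suspicious boxes. The sketch below is an illustration only: the function name `find_bad_boxes`, the `[x1, y1, x2, y2]` layout, and the 0-based pixel convention are assumptions, so adapt them to however your annotations are actually parsed.

```python
import numpy as np


def find_bad_boxes(boxes, width, height):
    """Return indices of ground-truth boxes that look invalid for a width x height image.

    boxes: (N, 4) array of [x1, y1, x2, y2], assumed to be 0-based pixel coordinates.
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    x1, y1, x2, y2 = boxes.T
    finite = np.isfinite(boxes).all(axis=1)
    # Comparisons on NaN may warn on some NumPy versions; `finite` covers those rows.
    with np.errstate(invalid="ignore"):
        in_bounds = (x1 >= 0) & (y1 >= 0) & (x2 <= width - 1) & (y2 <= height - 1)
        ordered = (x2 >= x1) & (y2 >= y1)
    return np.where(~(finite & in_bounds & ordered))[0]


if __name__ == "__main__":
    gt = [
        [10, 10, 50, 50],   # fine
        [-1, 0, 20, 20],    # negative coordinate
        [30, 30, 20, 40],   # x2 < x1
    ]
    print(find_bad_boxes(gt, width=500, height=375))
```

Running this over every image in the dataset before training narrows down whether the annotations are involved at all.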

fernandorovai commented 7 years ago

I am using Pascal VOC 2007, so I think it is safe w.r.t. annotation errors, right? The weird thing is that I don't get the error when training without the pre-trained ImageNet file. Do you have any idea? Thank you very much!

ijkguo commented 7 years ago

If you have not changed anything in the code, please try other mxnet examples, e.g. train_cifar10.py to rule out environment factors.

scarlettliu644 commented 6 years ago

Lower your learning rate according to your max_itr and stepsize in the four solver files. That should fix the problem.
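
To see how the learning rate relates to stepsize over a run, here is a small sketch. It assumes "solver files" refers to Caffe-style solver settings with the `step` policy, where the rate is base_lr * gamma ^ floor(iter / stepsize). The function name `step_lr` and all numeric values are hypothetical and not taken from this thread.

```python
import math


def step_lr(base_lr, gamma, stepsize, iteration):
    """Learning rate under a step policy: base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


if __name__ == "__main__":
    # Hypothetical settings, chosen only to illustrate the schedule.
    base_lr, gamma, stepsize, max_iter = 0.001, 0.1, 30000, 40000
    for it in range(0, max_iter + 1, 10000):
        print(it, step_lr(base_lr, gamma, stepsize, it))
```

If stepsize is larger than the total number of iterations, the rate never decays. When training on a smaller dataset or for fewer iterations, a lower base rate or a shorter stepsize may therefore be needed to keep training stable.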