endernewton / tf-faster-rcnn

Tensorflow Faster RCNN for Object Detection
https://arxiv.org/pdf/1702.02138.pdf
MIT License
3.65k stars 1.57k forks

Strange (pdb) when training on my own dataset #313

Open SharonZhu opened 6 years ago

SharonZhu commented 6 years ago

I am training tf-faster-rcnn on the Caltech pedestrian dataset, and after several steps of training the log shows this:

speed: 0.220s / iter
> /home/zhuxy/tf-faster-rcnn/lib/layer_utils/proposal_target_layer.py(138)_sample_rois()
-> keep_inds = np.append(fg_inds, bg_inds)
(Pdb)

Would someone help me? Thanks a lot.

a-scotto commented 6 years ago

I have encountered the same problem when training on the COCO dataset. The script gets stuck at the "(Pdb)" prompt, but neither CPU nor GPU resources are being used, which suggests nothing is actually running after this point!

Thanks to whoever comes with a solution!

niuniu111 commented 6 years ago

I have encountered the same problem. Did you fix it?

possatti commented 6 years ago

Different from SharonZhu, I'm having this problem even before the first step. I realized the (Pdb) prompt comes from the Python Debugger (https://docs.python.org/2/library/pdb.html). pdb seems to be launched from ./lib/layer_utils/proposal_target_layer.py, in a specific conditional inside the _sample_rois function (around line 134).

# Small modification to the original version where we ensure a fixed number of regions are sampled
if fg_inds.size > 0 and bg_inds.size > 0:
    fg_rois_per_image = min(fg_rois_per_image, fg_inds.size)
    fg_inds = npr.choice(fg_inds, size=int(fg_rois_per_image), replace=False)
    bg_rois_per_image = rois_per_image - fg_rois_per_image
    to_replace = bg_inds.size < bg_rois_per_image
    bg_inds = npr.choice(bg_inds, size=int(bg_rois_per_image), replace=to_replace)
elif fg_inds.size > 0:
    to_replace = fg_inds.size < rois_per_image
    fg_inds = npr.choice(fg_inds, size=int(rois_per_image), replace=to_replace)
    fg_rois_per_image = rois_per_image
elif bg_inds.size > 0:
    to_replace = bg_inds.size < rois_per_image
    bg_inds = npr.choice(bg_inds, size=int(rois_per_image), replace=to_replace)
    fg_rois_per_image = 0
else:
    import pdb  # HERE <<<<<<<<<<<<<
    pdb.set_trace()

But I have no idea why this is happening, or how to fix it.

Are all of you implementing your own dataset, like me? Or are you using one of the datasets that is already supported? Perhaps we are implementing our datasets wrong. I don't know.
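For what it's worth, the else branch only fires when both index sets come back empty. A minimal sketch of how that can happen; the threshold names mirror the config defaults as I understand them (FG_THRESH=0.5, BG_THRESH_HI=0.5, BG_THRESH_LO=0.1), and the overlap values are made up:

```python
import numpy as np

# Assumed config defaults (see ./lib/model/config.py in the repo).
FG_THRESH, BG_THRESH_HI, BG_THRESH_LO = 0.5, 0.5, 0.1

# Hypothetical per-proposal overlaps with ground truth for one image.
# Every value is below BG_THRESH_LO, so neither selection matches.
max_overlaps = np.array([0.02, 0.05, 0.08])

# Same selections _sample_rois performs before the conditional.
fg_inds = np.where(max_overlaps >= FG_THRESH)[0]
bg_inds = np.where((max_overlaps < BG_THRESH_HI) &
                   (max_overlaps >= BG_THRESH_LO))[0]

print(fg_inds.size, bg_inds.size)  # 0 0 -> the pdb branch is reached
```

So images whose proposals all have very low overlap with the ground-truth boxes (easy for a custom dataset with small or sparse objects) land in the pdb branch.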

possatti commented 6 years ago

I was able to get around this by setting __C.TRAIN.BG_THRESH_LO to 0 in the file ./lib/model/config.py. After doing this I trained my model successfully and everything is going well.
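For anyone looking for the exact spot, this is a sketch of the one-line change, assuming the easydict-style config in ./lib/model/config.py (the surrounding setup is unchanged):

```python
# ./lib/model/config.py -- sketch of the relevant line only.
# With the lower bound at 0.0, any proposal whose overlap falls below
# BG_THRESH_HI still qualifies as background, so bg_inds is never empty
# whenever at least one proposal exists.
__C.TRAIN.BG_THRESH_LO = 0.0  # was 0.1
```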


lu-jian-dong commented 6 years ago

This code works [lib/layer_utils/proposal_target_layer.py]:

if fg_inds.size > 0 and bg_inds.size > 0:
    fg_rois_per_image = min(fg_rois_per_image, fg_inds.size)
    fg_inds = npr.choice(fg_inds, size=int(fg_rois_per_image), replace=False)
    bg_rois_per_image = rois_per_image - fg_rois_per_image
    to_replace = bg_inds.size < bg_rois_per_image
    bg_inds = npr.choice(bg_inds, size=int(bg_rois_per_image), replace=to_replace)
elif fg_inds.size > 0:
    to_replace = fg_inds.size < rois_per_image
    fg_inds = npr.choice(fg_inds, size=int(rois_per_image), replace=to_replace)
    fg_rois_per_image = rois_per_image
elif bg_inds.size > 0:
    to_replace = bg_inds.size < rois_per_image
    bg_inds = npr.choice(bg_inds, size=int(rois_per_image), replace=to_replace)
    fg_rois_per_image = 0
else:
    import pdb
    # pdb.set_trace()
    bg_inds = np.where((max_overlaps < cfg.TRAIN.BG_THRESH_HI) &
                       (max_overlaps >= 0))[0]
    to_replace = bg_inds.size < rois_per_image
    bg_inds = npr.choice(bg_inds, size=int(rois_per_image), replace=to_replace)
    fg_rois_per_image = 0

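To illustrate why that fallback fixes the hang, here is a self-contained sketch with made-up overlap values: the normal background selection is empty (the case that used to hit pdb), but dropping the lower bound to 0 recovers every proposal as background:

```python
import numpy as np
import numpy.random as npr

# Assumed config values and made-up overlaps for one problematic image.
BG_THRESH_HI, BG_THRESH_LO = 0.5, 0.1
rois_per_image = 4
max_overlaps = np.array([0.02, 0.05, 0.08])

# Normal selection: every overlap is below BG_THRESH_LO, so it is empty.
bg_inds = np.where((max_overlaps < BG_THRESH_HI) &
                   (max_overlaps >= BG_THRESH_LO))[0]
assert bg_inds.size == 0  # previously this case dropped into pdb

# Fallback from the patch above: lower bound relaxed to 0.
bg_inds = np.where((max_overlaps < BG_THRESH_HI) &
                   (max_overlaps >= 0))[0]
to_replace = bg_inds.size < rois_per_image
bg_inds = npr.choice(bg_inds, size=int(rois_per_image), replace=to_replace)
print(bg_inds.size)  # 4: the batch is filled entirely with background RoIs
```

Sampling with replace=True is what lets 3 proposals fill a 4-RoI batch, so training can continue instead of stopping at a debugger prompt.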
ZhuoyaYang commented 4 years ago

The anchor sizes may be too small or too big for your dataset. Try changing ANCHORS in train_faster_rcnn.sh.
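For reference, a sketch of the kind of change meant here, in the per-dataset case arm of experiments/scripts/train_faster_rcnn.sh (variable names as I recall them from the stock script; the values are illustrative):

```shell
# experiments/scripts/train_faster_rcnn.sh -- sketch of one case arm.
# Adding a smaller scale (4) helps when objects are small, e.g. distant
# pedestrians; ratios taller than wide could also suit pedestrians.
ANCHORS="[4,8,16,32]"
RATIOS="[0.5,1,2]"
```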