Eniac-Xie / faster-rcnn-resnet

ResNet Implementation for Faster-rcnn
MIT License

loss output: is this normal? #16

Open askerlee opened 7 years ago

askerlee commented 7 years ago

Hi, I'm using ResNet101_BN_SCALE_Merged_OHEM on my own dataset. Some of the output losses (loss_bbox and loss_cls) are always 0.

Update: it seems something is wrong with OHEM. When I turn OHEM off, everything is normal.

I1019 22:34:34.436921 14581 solver.cpp:229] Iteration 760, loss = 0.0427504
I1019 22:34:34.436954 14581 solver.cpp:245]     Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I1019 22:34:34.436959 14581 solver.cpp:245]     Train net output #1: loss_cls = 0 (* 1 = 0 loss)
I1019 22:34:34.436962 14581 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.0208707 (* 1 = 0.0208707 loss)
I1019 22:34:34.436965 14581 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.00372629 (* 1 = 0.00372629 loss)

The output with OHEM turned off:

I1020 14:29:00.407395 19371 solver.cpp:245]     Train net output #0: loss_bbox = 0.652186 (* 1 = 0.652186 loss)
I1020 14:29:00.407400 19371 solver.cpp:245]     Train net output #1: loss_cls = 0.654309 (* 1 = 0.654309 loss)
I1020 14:29:00.407404 19371 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.113032 (* 1 = 0.113032 loss)
I1020 14:29:00.407408 19371 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.0568502 (* 1 = 0.0568502 loss)

whmin commented 6 years ago

@askerlee @Eniac-Xie Could you please release the "ResNet101_BN_SCALE_Merged_OHEM" model files, including test.prototxt? The files I downloaded do not contain it, and I have no time to write it myself because of an emergency. Thank you very much!

askerlee commented 6 years ago

Just copy the test.prototxt from the ResNet101_BN_SCALE_Merged folder. They are the same (hard example mining only happens in training, so the test model is the same).
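To illustrate why hard example mining affects only training, here is a minimal sketch of the selection step. It is not the code from this repo: the function name `select_hard_examples` and the sample losses are made up for illustration, and it simply keeps the ROIs with the highest loss.

```python
import numpy as np

def select_hard_examples(per_roi_loss, batch_size):
    """Return indices of the batch_size ROIs with the highest loss.

    Hypothetical helper for illustration only; a simplification of OHEM,
    not the implementation used in this repo.
    """
    per_roi_loss = np.asarray(per_roi_loss, dtype=np.float64)
    order = np.argsort(-per_roi_loss)  # sort indices by descending loss
    return order[:batch_size]

# Made-up per-ROI losses from a forward pass over all proposals.
losses = [0.05, 1.2, 0.3, 0.9]
hard = select_hard_examples(losses, batch_size=2)
print(hard.tolist())
```

Only the selected ROIs would contribute to the backward pass. At test time no loss is computed, so there is nothing to select, which is consistent with the test prototxt being shared between the two folders.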

whmin commented 6 years ago

OK, thank you! Now I get an error when I run ./experiments/scripts/faster_rcnn_end2end.sh 0 ResNet-50 pascal_voc, like this: [screenshot from 2017-11-08 21-08-43]. I did not change the original ResNet-50 OHEM code, except to replace "num_classes" and "num_output". I cannot solve it; could you help me?

askerlee commented 6 years ago

Change cls_prob[i,label] to cls_prob[i,int(label)] in lib/roi_data_layer/layer.py:242.
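The screenshot is not reproduced here, so the following is only a guess at the cause: recent NumPy versions refuse to index an array with a float, and labels read from a blob are typically floats. A minimal sketch with made-up arrays (not the repo's data or code):

```python
import numpy as np

# Made-up class probabilities for two ROIs and two classes.
cls_prob = np.array([[0.1, 0.9],
                     [0.7, 0.3]])
# Labels stored as floats (assumed here to mimic how they arrive in the layer).
labels = np.array([1.0, 0.0])

# Indexing with a float label fails on recent NumPy versions.
try:
    cls_prob[0, labels[0]]
    float_index_failed = False
except IndexError:
    float_index_failed = True

# Casting the label to int, as suggested above, makes the lookup valid.
picked = [cls_prob[i, int(label)] for i, label in enumerate(labels)]
print(float_index_failed, picked)
```

Very old NumPy releases only warned about float indices instead of raising, which may explain why the original code ran unchanged for some users.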

oysz2016 commented 5 years ago

Hi, I have encountered the same problem. Have you solved it? I think the problem may be in the code. When I trained the author's prototxt file with OHEM code that I had modified myself, the loss was not 0, but training had difficulty converging (a problem that did not occur with VGG16).