enhany closed 4 years ago
@enhany
I have also often seen this in maskrcnn-benchmark and FCOS.
My guess is that it comes from the random initialization of the weights outside the backbone.
Lowering the LR helps (10-100 times lower).
@enhany I ran into the NaN problem too. If I lower the LR by 10-100 times, will it hurt performance?
@zimenglan-sysu-512 You need to increase your MAX_ITER by 10-100 times. So yes, it will take more time to train.
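
To make the trade-off concrete, here is a rough sketch of dividing the LR by some factor and stretching the schedule by the same factor. The helper name `scale_schedule` is hypothetical, the dict keys are only modeled on the solver options, and the numbers are illustrative rather than taken from this issue. Scaling the decay steps along with MAX_ITER is my assumption; the thread only mentions MAX_ITER.

```python
def scale_schedule(base_lr, max_iter, steps, factor):
    """Divide the learning rate by `factor` and stretch the schedule by `factor`.

    Hypothetical helper: it returns plain values that you would then copy
    into your own config by hand.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return {
        "BASE_LR": base_lr / factor,
        "MAX_ITER": int(max_iter * factor),
        # Assumption: LR decay milestones move together with MAX_ITER.
        "STEPS": tuple(int(s * factor) for s in steps),
    }


if __name__ == "__main__":
    # Illustrative values only, not the settings from this issue.
    print(scale_schedule(base_lr=0.02, max_iter=90000, steps=(60000, 80000), factor=10))
```

With a factor of 10 the run is ten times longer, which is the extra training time mentioned above.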
When I try to train on the COCO 2014 dataset, I get a NaN loss, but not always; sometimes training runs fine.
cmd line:
env:
Bad results from beginning:
Ok result: