Open y200504040u opened 4 years ago
Nice copy LOL. By the way, I think it's because your learning rate is too high. Try lowering it by a factor of 10 to 100, and don't forget to train for more iterations.
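To make the suggestion above concrete, here is a minimal sketch of scaling the solver settings. The key names mirror detectron2's `SOLVER.BASE_LR`, `SOLVER.MAX_ITER`, and `SOLVER.STEPS`, but the baseline numbers, the helper `scale_solver`, and its default factors are assumptions for illustration, not values taken from this issue.

```python
def scale_solver(solver, lr_divisor=10.0, iter_multiplier=2.0):
    """Return a copy of the solver settings with a lower LR and a longer schedule.

    Hypothetical helper: `solver` is a plain dict standing in for the SOLVER
    section of a config file.
    """
    scaled = dict(solver)
    # Lower the learning rate (the comment suggests dividing by 10 to 100).
    scaled["BASE_LR"] = solver["BASE_LR"] / lr_divisor
    # Train for more iterations to compensate for the smaller steps.
    scaled["MAX_ITER"] = int(round(solver["MAX_ITER"] * iter_multiplier))
    # Stretch the LR decay milestones by the same factor.
    scaled["STEPS"] = tuple(int(round(s * iter_multiplier)) for s in solver["STEPS"])
    return scaled


# Illustrative baseline values (assumed, not from the issue).
baseline = {"BASE_LR": 0.01, "MAX_ITER": 90000, "STEPS": (60000, 80000)}
adjusted = scale_solver(baseline, lr_divisor=10.0, iter_multiplier=2.0)
print(adjusted)
```

The resulting values would then be copied into the `SOLVER` section of the config YAML by hand.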
Cut-and-pasted 😂... I tried a lower learning rate; the loss no longer explodes, but it doesn't decrease either. I read the VoVNet paper, and in the experiments the authors didn't use VoVNet as the backbone of any object detection network except RefineDet.
Same error here. I can't manage to fit a vovnet-lite-dw or a vovnet-19-dw; I keep getting NaN loss. Vovnet-lite is fine though, so I have the feeling that something is wrong with the depthwise convolution.
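One way to narrow this kind of problem down is to check which loss term goes non-finite first. The sketch below assumes the training loop exposes a dict of scalar losses per iteration; the helper `non_finite_losses` and the sample values are hypothetical, and the names `loss_cls` and `loss_box_reg` are used only as example keys.

```python
import math


def non_finite_losses(loss_dict):
    """Return the names of loss terms that are NaN or infinite."""
    return [name for name, value in loss_dict.items() if not math.isfinite(value)]


# Made-up per-iteration losses, for illustration only.
history = [
    {"loss_cls": 1.12, "loss_box_reg": 0.48},
    {"loss_cls": 35.7, "loss_box_reg": 0.51},
    {"loss_cls": float("nan"), "loss_box_reg": 0.55},
]

for iteration, losses in enumerate(history):
    bad = non_finite_losses(losses)
    if bad:
        print(f"iteration {iteration}: non-finite loss terms: {bad}")
        break
```

Logging the last few finite values before the failure can also show whether the loss grew steadily or jumped in a single step.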
When I tested this kind of lightweight backbone for object detection (e.g., MobileNet, ShuffleNet, etc.), I set a longer warm-up, i.e. more warm-up iterations.
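As a rough illustration of what a longer warm-up does, here is a sketch of a linear warm-up schedule. The function `warmup_lr`, the base learning rate, and the iteration counts are assumptions chosen for the example; the parameters loosely correspond to what detectron2 calls `SOLVER.WARMUP_ITERS` and `SOLVER.WARMUP_FACTOR`.

```python
def warmup_lr(base_lr, iteration, warmup_iters, warmup_factor=0.001):
    """Linear warm-up: ramp from base_lr * warmup_factor up to base_lr."""
    if iteration >= warmup_iters:
        return base_lr
    alpha = iteration / warmup_iters
    return base_lr * (warmup_factor * (1.0 - alpha) + alpha)


base_lr = 0.01  # illustrative value, not from the issue
for warmup_iters in (1000, 5000):
    lr = warmup_lr(base_lr, iteration=500, warmup_iters=warmup_iters)
    print(f"warmup_iters={warmup_iters}: lr at iteration 500 = {lr:.6f}")
```

With the longer warm-up, the learning rate at any early iteration is smaller, which gives the randomly initialized layers more time to settle before the full learning rate applies.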
Hi! Thank you for your great work. I wanted to improve the RetinaNet project in detectron2/projects by replacing "retinanet_resnet_fpn_backbone" with "retinanet_vovnet_fpn_backbone". However, I always encountered "loss NaN" within the first 1000 iterations of training. Training with "retinanet_resnet_fpn_backbone" works fine.
I want to make sure that I'm not doing something wrong.
My config YAML:
build_retinanet_vovnet_fpn_backbone