Using the MMDet version of VFNet with the latest backbones (e.g. Poolformer S36, ConvNeXt Small) leads to Inf issues in the Varifocal loss

hyz-xmaster / VarifocalNet

VarifocalNet: An IoU-aware Dense Object Detector

Apache License 2.0

348 stars 52 forks

Using the MMDet version of VFNet with the latest backbones (e.g. Poolformer S36, ConvNeXt Small) leads to Inf issues in the Varifocal loss #26

Open cydiachen opened 2 years ago

cydiachen commented 2 years ago

Thank you for your excellent work. I am now experimenting with improving VFNet using the latest backbones (e.g. Poolformer S36, ConvNeXt Small). The network trains fine for the first 5 epochs and then suffers a significant performance drop caused by an unexpected Inf value in cls_loss (in my case, the varifocal loss). I would appreciate any advice on tracking down the issue. (I have tried grad_clip to clip the Inf gradients, but it does not solve the issue.)

hyz-xmaster commented 2 years ago

Hi, if the first 5 epochs are warm-up epochs, you may want to set a lower learning rate. The 'Inf' value problem is possibly caused by some very large negative predictions, say -100000000. For such a prediction p, sigmoid(p) underflows to 0, so log(sigmoid(p)) -> -Inf and the loss term -log(sigmoid(p)) -> Inf.
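
To illustrate the failure mode described above, here is a minimal sketch in plain Python (standard library only, not the MMDet or VFNet loss code). The function names `naive_log_sigmoid` and `stable_log_sigmoid` are hypothetical, chosen only for this example. It compares a naive log(sigmoid(p)) with the algebraically equivalent form min(p, 0) - log(1 + exp(-|p|)), which stays finite for very negative p.

```python
import math


def naive_log_sigmoid(p: float) -> float:
    """Compute log(sigmoid(p)) the direct way (hypothetical helper).

    For very negative p, exp(-p) overflows and sigmoid(p) collapses to 0,
    so the logarithm is -inf. Python's math module raises exceptions in
    these cases, so they are mapped to the values floating-point tensors
    would typically produce.
    """
    try:
        s = 1.0 / (1.0 + math.exp(-p))
    except OverflowError:
        s = 0.0  # exp(-p) is too large, so sigmoid(p) is effectively 0
    return math.log(s) if s > 0.0 else float("-inf")


def stable_log_sigmoid(p: float) -> float:
    """Compute log(sigmoid(p)) as min(p, 0) - log1p(exp(-|p|)).

    exp() only ever sees a non-positive argument here, so it cannot overflow.
    """
    return min(p, 0.0) - math.log1p(math.exp(-abs(p)))


if __name__ == "__main__":
    for p in (0.0, -10.0, -1000.0, -100000000.0):
        print(p, naive_log_sigmoid(p), stable_log_sigmoid(p))
```

Note that the stable form only removes the Inf: for p = -100000000 the loss term is still on the order of 1e8, which can destabilize training on its own. That is consistent with the suggestion to lower the learning rate so that predictions do not drift to such extreme values in the first place.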