LR and False Positives Inquiry

RangiLyu / nanodet

NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB(int8) / 1.8MB (fp16) and run 97FPS on cellphone🔥

Apache License 2.0


LR and False Positives Inquiry #276

Open andeees opened 3 years ago

andeees commented 3 years ago

Hello, I have two brief questions:

1. When training with this cfg I get lots of false positives. Is there any recipe (besides more data) to mitigate this, e.g. changing the loss weights?
2. I was surprised to see that, in most cfgs, a high learning rate (e.g. starting at lr=0.14) is used for many epochs before a scheduler reduces it. Was this found empirically to work better?

RangiLyu commented 3 years ago

1. This is a common problem when using Focal Loss and quality-aware methods (like GFL in NanoDet). One way to address it is to use a target sampling method, but that is highly dependent on your data, so I cannot give a specific solution.
2. One reason for using such a large lr is that the batch size is very large here (192 per GPU). I also found that training at a high learning rate for a very long time before lr decay can improve the final mAP.

BTW, the hyperparameters in the config may not be the best because I do not have the resources to search them. If anyone finds a better setting on COCO, a PR is welcome.
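
To illustrate the kind of schedule discussed in the reply above (a high learning rate held for many epochs, then decayed), here is a minimal pure-Python sketch of a step schedule. The function name `step_lr`, the milestone epochs, and the decay factor are hypothetical values chosen for illustration and are not taken from the NanoDet configs. Only the starting value 0.14 comes from the question.

```python
def step_lr(epoch, base_lr=0.14, milestones=(130, 160, 175), gamma=0.1):
    """Learning rate at `epoch` under a multi-step decay schedule.

    base_lr:    starting learning rate (0.14 is the value quoted in the question)
    milestones: epochs at which the lr is multiplied by gamma (hypothetical)
    gamma:      multiplicative decay factor (hypothetical)
    """
    # Count how many milestones have already been reached.
    num_decays = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** num_decays


if __name__ == "__main__":
    # The lr stays at base_lr for the first 130 epochs, then drops in steps.
    for epoch in (0, 129, 130, 160, 175):
        print(epoch, step_lr(epoch))
```

With these assumed milestones, most of the training budget is spent at the initial rate, and the decays are packed into the final epochs.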

andeees commented 3 years ago

@RangiLyu thanks for the reply! One last question:

I am a little confused: are gradients summed instead of averaged across batches, then? (When using a sum, the lr generally has to be adjusted based on the batch size.)

I will try different hyper-parameters for my custom dataset and try to alleviate (1). Thanks :)
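
For reference, the distinction raised in the last question (summed versus averaged gradients) can be shown with a small pure-Python sketch. It uses a toy one-parameter squared-error model, and the function names are hypothetical. It says nothing about which reduction NanoDet actually uses. It only shows why a summed gradient grows with batch size while an averaged one does not.

```python
def per_sample_grads(w, xs, ys):
    """Gradient of (w*x - y)^2 with respect to w, one value per sample."""
    return [2.0 * (w * x - y) * x for x, y in zip(xs, ys)]


def grad_sum(w, xs, ys):
    """Gradient of the loss when per-sample losses are summed."""
    return sum(per_sample_grads(w, xs, ys))


def grad_mean(w, xs, ys):
    """Gradient of the loss when per-sample losses are averaged."""
    grads = per_sample_grads(w, xs, ys)
    return sum(grads) / len(grads)


# Toy data: the same samples, once as a batch of 4 and once duplicated to 8.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

small_sum, small_mean = grad_sum(w, xs, ys), grad_mean(w, xs, ys)
large_sum, large_mean = grad_sum(w, xs * 2, ys * 2), grad_mean(w, xs * 2, ys * 2)

if __name__ == "__main__":
    print("sum  reduction:", small_sum, "->", large_sum)    # doubles with batch size
    print("mean reduction:", small_mean, "->", large_mean)  # unchanged
```

Under a sum reduction the step size effectively scales with the batch size, so the lr has to be adjusted to compensate. Under a mean reduction the gradient magnitude is roughly independent of batch size, and a larger lr with a larger batch is a separate tuning choice.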