kemaloksuz / RankSortLoss

Official PyTorch Implementation of Rank & Sort Loss for Object Detection and Instance Segmentation [ICCV2021]
Apache License 2.0

About adjusting the learning rate #15

Closed sanmulab closed 2 years ago

sanmulab commented 2 years ago

Hi author, thank you for your excellent work! I want to train on my own dataset with rs_cascade_rcnn, so I'm wondering how I should set a reasonable learning rate (only one GPU, batch_size=2). Looking forward to your reply!

kemaloksuz commented 2 years ago

Hi,

Thanks for your interest in our work. It may be difficult to suggest a learning rate for a new dataset and a different batch size. However, I can try to describe what I would do:

If you have an idea of a good learning rate for the standard Cascade R-CNN, you can start with it and search at a precision of 0.0005. If not, then based on our COCO experiments, your batch size, and the linear scaling rule, I would start with 0.0015 and again search at a precision of 0.0005.
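For concreteness, here is a minimal sketch of the linear scaling arithmetic behind that suggestion. It assumes a base learning rate of 0.012 at a total batch size of 16, which is consistent with the 0.0015 above; please verify against the config file you actually use:

```python
# Linear scaling rule: the learning rate scales proportionally
# with the total batch size.
base_lr = 0.012        # assumed COCO learning rate (total batch size 16)
base_batch_size = 16   # 4 GPUs x 4 images per GPU

my_batch_size = 2      # 1 GPU x 2 images per GPU

scaled_lr = base_lr * my_batch_size / base_batch_size
print(scaled_lr)       # 0.0015, the starting point suggested above
```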

Hope this helps,

Kemal

sanmulab commented 2 years ago

Thanks for the reply! I have two questions: 1. How many GPUs did you use for training? 2. Regarding the score_thr in test_cfg, why is it set so high? Generally speaking, a lower threshold should improve accuracy, so I don't understand why you set score_thr so large.

kemaloksuz commented 2 years ago
  1. We used 4 GPUs with 4 images on each GPU, so the total batch size is 16 images in our setting.

  2. You are right. A smaller score threshold can yield a larger AP, and you can set it to 0.05 in your setting to obtain 0.1-0.3 higher AP. However, a smaller threshold also implies longer inference time, since a larger number of detections remains as input to the NMS. We observed that detectors trained with our RS Loss produce larger confidence scores than conventionally trained models, so, purely for faster inference, we set the score threshold larger. Please see Section C.6.2 and Table A.21 (Table A.20 can also be useful) in the appendix of the paper for more discussion: https://arxiv.org/pdf/2107.11669.pdf
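For reference, the knob in question is the score_thr field under test_cfg in the mmdetection-style config. A minimal sketch of the relevant block (the values here are illustrative, not copied from a specific config file in this repo):

```python
# mmdetection-style test-time config for the R-CNN head (sketch).
test_cfg = dict(
    rcnn=dict(
        # Lowering score_thr (e.g. to the common mmdetection default
        # of 0.05) keeps more low-confidence boxes and can add the
        # 0.1-0.3 AP mentioned above, at the cost of passing more
        # detections into NMS and hence slower inference.
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.5),
        max_per_img=100))
```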

If you have more questions, please let me know.

sanmulab commented 2 years ago

I see, thank you author!