hustvl / YOLOS

[NeurIPS 2021] You Only Look at One Sequence
https://arxiv.org/abs/2106.00666
MIT License

About Learning Rate Scheduler #19

Closed 1049451037 closed 2 years ago

1049451037 commented 2 years ago

❔Question

Why is the learning rate scheduler stepped after each epoch instead of after each batch in main.py?

Won't the lr change too slowly (and inconsistently across different dataset sizes)?

Yuxin-CV commented 2 years ago

Hi, Qingsong. Thanks for this issue.

To my knowledge, in image recognition the lr is usually stepped per epoch when using a cosine lr scheduler, e.g., in the widely used timm library.

In NLP, it seems the lr scheduler is stepped after each iteration / batch, e.g., in the BEiT repo. The same is true for semantic segmentation in vision.

I agree with you that stepping per iteration is more reasonable than stepping per epoch, and it should yield results no worse than stepping per epoch.
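To make the difference concrete, here is a minimal sketch in plain Python (not the code from main.py, timm, or BEiT). It assumes a plain cosine decay without warmup, and the helper names `cosine_lr`, `lr_per_epoch`, and `lr_per_iteration` are hypothetical, introduced only for illustration.

```python
import math


def cosine_lr(base_lr, progress, min_lr=0.0):
    """Cosine-annealed lr for a training progress value in [0, 1]."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def lr_per_epoch(base_lr, epoch, num_epochs):
    """Step by epoch: the lr is held constant for every batch in an epoch."""
    return cosine_lr(base_lr, epoch / num_epochs)


def lr_per_iteration(base_lr, epoch, it, iters_per_epoch, num_epochs):
    """Step by iteration: the lr is updated after every batch."""
    step = epoch * iters_per_epoch + it
    total_steps = num_epochs * iters_per_epoch
    return cosine_lr(base_lr, step / total_steps)


if __name__ == "__main__":
    # Toy values chosen only for illustration.
    base_lr, num_epochs, iters_per_epoch = 1e-3, 4, 3
    for epoch in range(num_epochs):
        for it in range(iters_per_epoch):
            by_epoch = lr_per_epoch(base_lr, epoch, num_epochs)
            by_iter = lr_per_iteration(base_lr, epoch, it, iters_per_epoch, num_epochs)
            print(f"epoch={epoch} it={it} per-epoch={by_epoch:.6f} per-iter={by_iter:.6f}")
```

Under these assumptions the two schedules agree at the first batch of every epoch and follow the same cosine curve over training progress. Stepping per epoch gives a staircase with `num_epochs` distinct values, while stepping per iteration gives a smooth curve whose resolution grows with the number of batches per epoch.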

1049451037 commented 2 years ago

Got it. Thank you for your reply!