Closed zhangshuoneu closed 2 weeks ago
You can generally increase the learning rate as long as there are no loss spikes. It might be helpful to decay the learning rate even if a higher one initially works. If I remember correctly, higher learning rates caused more loss spikes on some scenes, which is why we chose 3e-5.
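For anyone following along, the decay idea mentioned above can be sketched in plain Python. The base rate of 3e-5 matches `overfit.py`; the decay factor and step counts below are illustrative placeholders, not values from the repository:

```python
def decayed_lr(step, base_lr=3e-5, gamma=0.999):
    """Exponentially decay the learning rate each optimizer step.

    base_lr: starting rate (3e-5, as used in overfit.py).
    gamma:   per-step decay factor (illustrative, not from the repo).
    """
    return base_lr * (gamma ** step)

# At step 0 the schedule returns the base rate; around step 1000 it has
# decayed to roughly 37% of it (0.999 ** 1000 ≈ 0.368), which lets you
# start a bit higher and still avoid loss spikes late in training.
schedule = [decayed_lr(s) for s in (0, 100, 1000)]
```

In practice you would pass the same idea to your framework's scheduler (e.g. an exponential or cosine schedule) rather than computing it by hand.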
Thanks!
Wonderful work! I'm new to deep learning, and I'm confused about the learning rate setting when fine-tuning the network. I noticed that overfit.py uses 3e-5. Is a 1e-4 or 1e-5 magnitude appropriate? I look forward to hearing from you! Thanks