karanchahal / distiller

A large scale study of Knowledge Distillation.

Is it possible to keep the learning rate constant? #3

Closed JuanDavidG1997 closed 4 years ago

JuanDavidG1997 commented 4 years ago

I am trying to run some experiments comparing the effects of different parameters. Is it possible to avoid the reduction of the learning rate? I saw something related in optimizer.py, but I would like to be sure. Thank you!

fruffy commented 4 years ago

Yes, get_scheduler and get_optimizer are just wrappers; you can set the scheduler to none and run SGD at a constant rate. I quickly pushed something that lets you pick the scheduler and optimizer on the command line: run with --scheduler constant and no scheduler will be used.
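
For reference, a constant learning rate just means plain SGD with no scheduler attached. Here is a minimal PyTorch sketch of that setup; the model, learning rate, momentum, and epoch count are illustrative assumptions, not values taken from this repo:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 2)  # placeholder model
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

for epoch in range(150):
    # ... usual training loop, calling optimizer.step() per batch ...
    # With no scheduler, the learning rate stays at 0.1 for every epoch.
    print(epoch, optimizer.param_groups[0]["lr"])
```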

JuanDavidG1997 commented 4 years ago

Works perfectly, thank you. Another question: the size of the learning rate reduction when it is not constant is unclear to me. I know it starts at 0.1, but how large is it after each reduction?

fruffy commented 4 years ago

Depends on the scheduler and the configuration. Normally it reduces the rate to 10% of the previous value, so 0.1 goes to 0.01, then 0.001, and so on. We mostly rely on the standard functionality described here: https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
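
To illustrate that step decay, here is a minimal PyTorch sketch using MultiStepLR with gamma=0.1; the milestone epochs are assumptions for demonstration, not this repo's defaults:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 2)  # placeholder model
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# At each milestone epoch, the learning rate is multiplied by gamma (here 0.1).
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120], gamma=0.1)

for epoch in range(150):
    # ... usual training loop, calling optimizer.step() per batch ...
    scheduler.step()
    # epochs 0-59: lr = 0.1, epochs 60-119: lr = 0.01, epochs 120+: lr = 0.001
```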