Open hcgasser opened 2 years ago
@hcgasser I think your argument makes sense. From a quick look at the code, it seems the smoothed loss is only used for the early stopping condition, so this should only affect the decision of where to stop. Is that correct?
Have you tried making changes to the code to start with the initial loss instead of 0?
Thanks @awaelchli for your quick response, and sorry for taking so long to answer. I was drowning in things to do.
The smoothed_loss variable actually goes into the losses list https://github.com/Lightning-AI/lightning/blob/a5b0f8bd5cd28fbd79fdafa5d9380b00258d7a76/src/pytorch_lightning/tuner/lr_finder.py#L379
which is then read out and stored into the _LRFinder by the lr_find method https://github.com/Lightning-AI/lightning/blob/a5b0f8bd5cd28fbd79fdafa5d9380b00258d7a76/src/pytorch_lightning/tuner/lr_finder.py#L251
and then used by the _LRFinder to suggest the optimal learning rate https://github.com/Lightning-AI/lightning/blob/a5b0f8bd5cd28fbd79fdafa5d9380b00258d7a76/src/pytorch_lightning/tuner/lr_finder.py#L201
I have tried the following changes:
The result I found was that the loss curve used for selecting the optimal learning rate is then very strongly influenced by the first observed smoothed_loss, which is very high because the network is just seeing its first batch. With the current code, the curve is instead very strongly influenced by the initial value of zero. Ways to deal with that might be a lower beta or a warm-up period. What do you think?
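To make the trade-off concrete, here is a minimal sketch of a plain exponential moving average seeded with either zero or the first observed loss. It is not the code from `lr_finder.py`: the helper name `ema_smooth`, the value `beta=0.98`, and the loss values are assumptions chosen purely for illustration.

```python
def ema_smooth(losses, beta=0.98, init=0.0):
    """Exponential moving average of `losses`, seeded with `init`.

    Hypothetical helper for illustration; not part of Lightning.
    """
    avg = init
    smoothed = []
    for loss in losses:
        # Each step keeps a fraction `beta` of the running average
        # and mixes in a fraction (1 - beta) of the new loss.
        avg = beta * avg + (1 - beta) * loss
        smoothed.append(avg)
    return smoothed


# Made-up raw losses: a very high first batch, then a quick drop.
raw = [10.0, 4.0, 3.0, 2.5, 2.0]

from_zero = ema_smooth(raw, init=0.0)      # pulled towards 0 early on
from_first = ema_smooth(raw, init=raw[0])  # pulled towards the first loss

for step, (z, f) in enumerate(zip(from_zero, from_first)):
    print(f"step {step}: init=0 -> {z:.3f}, init=first loss -> {f:.3f}")
```

With a beta this close to 1, both curves stay close to their seed value for many steps, so neither reflects the raw losses early on. A lower beta shortens that memory in both cases.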
@hcgasser Yes, I think that's a valid observation. How about we ignore the first N steps in the selection among smoothed values (I think you called it warmup above)? We could choose N=1 or similar as the default value.
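A rough sketch of what ignoring the first N steps could look like is shown below. The function `suggest_lr` and its `skip_first` parameter are hypothetical names, and the steepest-descent rule based on forward differences is a simplified stand-in for the real suggestion logic, not Lightning's implementation.

```python
def suggest_lr(lrs, smoothed_losses, skip_first=1):
    """Pick the learning rate where the smoothed loss falls fastest,
    ignoring the first `skip_first` recorded steps.

    Hypothetical helper for illustration; not part of Lightning.
    """
    lrs = list(lrs)[skip_first:]
    losses = list(smoothed_losses)[skip_first:]
    if len(losses) < 2:
        return None  # not enough points left to estimate a slope

    # Forward difference: change in loss between step i and step i + 1.
    slopes = [losses[i + 1] - losses[i] for i in range(len(losses) - 1)]
    steepest = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[steepest]


# Made-up values: the first point is dominated by the initial batch.
lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
losses = [10.0, 3.0, 2.9, 1.0, 0.95]

print(suggest_lr(lrs, losses, skip_first=0))  # picks the drop away from the first point
print(suggest_lr(lrs, losses, skip_first=1))  # ignores the first point
```

In this toy data the drop away from the first point is the steepest one, so without skipping it decides the suggestion. With `skip_first=1` the selection moves to the later drop.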
Any news about this issue, @awaelchli?
Discussed in https://github.com/Lightning-AI/lightning/discussions/13404
cc @borda @akihironitta @rohitgr7