lessw2020 / Ranger21

Ranger deep learning optimizer rewrite to use newest components
Apache License 2.0

What is the best hyper-parameter setting? #42

Open NoOneUST opened 2 years ago

NoOneUST commented 2 years ago

Hello, what kind of hyper-parameters should I use for a first try? For example, the learning rate, or choosing between the AdamW and MADGRAD cores?

lessw2020 commented 2 years ago

Hi @NoOneUST, I would recommend starting with no modifications, i.e. just run Ranger21 as-is, since it's designed to provide smart defaults. After that you can start adjusting; the learning rate is probably the first thing to tune, especially based on the initial training results. Hope that helps!
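
For reference, a minimal sketch of what "run with defaults" looks like in code, assuming the constructor arguments `lr`, `num_epochs`, and `num_batches_per_epoch` (which Ranger21 uses for its built-in warmup/warmdown scheduling); check the installed version for the exact signature:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from ranger21 import Ranger21

# Stand-ins for your model and data pipeline
model = torch.nn.Linear(128, 10)
dataset = TensorDataset(torch.randn(512, 128), torch.randint(0, 10, (512,)))
train_loader = DataLoader(dataset, batch_size=32)

num_epochs = 60
optimizer = Ranger21(
    model.parameters(),
    lr=1e-3,                                  # the first knob to tune, per the advice above
    num_epochs=num_epochs,                    # used for the internal warmup/warmdown schedule
    num_batches_per_epoch=len(train_loader),
)
```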

wassname commented 2 years ago

I'll just add one example I've found. For time series, this repo uses a subset of Ranger21's components: https://github.com/jdb78/pytorch-forecasting/blob/master/pytorch_forecasting/optim.py.

Make of it what you will ;p