Closed by shengyuan-tang 4 years ago
Flat just means keeping your initial learning rate unchanged. Flat+Anneal simply means you start with a constant lr and begin annealing only after a number of epochs have passed (usually around 70-72% of the total).
There are some schedulers that do this on GitHub (I don't have a link right now, I'll add it later if I find it again). Otherwise, count the epochs yourself and switch to your annealing scheduler only after a set number of epochs has passed.
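To make the "count the epochs" option concrete, here is a minimal sketch of a flat-then-cosine schedule written as a plain function. It is not the scheduler used by Ranger or by any particular repo; the name `flat_anneal_factor` and the `flat_fraction` parameter are made up for illustration, and the 0.72 default just follows the rough 70-72% figure mentioned above.

```python
import math


def flat_anneal_factor(epoch, total_epochs, flat_fraction=0.72):
    """Return a multiplier for the initial lr at a given 0-indexed epoch.

    Hypothetical helper: stays at 1.0 for the flat phase, then follows
    a half cosine down to 0.0 over the remaining epochs.
    """
    # Number of epochs to hold the lr constant (assumed ~72% of training).
    flat_epochs = round(total_epochs * flat_fraction)
    if epoch < flat_epochs:
        return 1.0

    # Fraction of the annealing phase completed so far, clamped to [0, 1].
    anneal_epochs = max(1, total_epochs - flat_epochs)
    progress = min(1.0, (epoch - flat_epochs) / anneal_epochs)

    # Standard cosine annealing from 1.0 down to 0.0.
    return 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    base_lr = 1e-3
    total = 100
    for e in (0, 50, 72, 86, 99):
        print(e, base_lr * flat_anneal_factor(e, total))
```

If you are on PyTorch, a function like this can probably be plugged in via `torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda e: flat_anneal_factor(e, total_epochs))`, since `LambdaLR` multiplies the initial lr by whatever the lambda returns. Annealing all the way to zero is an assumption here; you may want a small floor instead.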
OK, I get it, thank you.
Do you have a paper link for flat cosine annealing or comparison to cyclical annealing or one-cycle policy?
Hi @austinmw,
We didn't write a paper on it. It was just an invention by @grankin (on the fastai forums), because we noticed that running Ranger flat for a while was more effective than bouncing the learning rate up and down à la cyclical annealing.
Hope that helps!
Not sure if it is related, but I have written a scheduler (https://github.com/wangg12/flat_anneal_scheduler.pytorch) for the flat and cosine schedule (and many other common schedules) in just one function.
I want to test your Ranger optimizer, but I only know cosine annealing training. Can you tell me what "flat" means? Thanks.