The cosine annealer is updated once per epoch for a total of 200-250 epochs, annealing from 0.001 all the way down to 0.00001. By "updated" I mean that I simply take the next step in the cosine cycle I am using. There are no warm restarts.
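For anyone who wants to see this concretely, here is a minimal sketch of a schedule like the one described, assuming the annealed quantity is the learning rate and that the standard half-cosine formula is used. The function name `cosine_annealed_lr`, its parameter names, and the choice of 200 epochs (the lower end of the stated range) are illustrative assumptions, not the actual training code.

```python
import math


def cosine_annealed_lr(epoch, total_epochs, lr_max=1e-3, lr_min=1e-5):
    """Value for a given epoch on a single half-cosine cycle (no warm restarts).

    Hypothetical helper: lr_max and lr_min default to the 0.001 and 0.00001
    endpoints mentioned in the thread; everything else is an assumption.
    """
    # Fraction of the cycle completed; clamped so the value stays at lr_min
    # if training runs past total_epochs (the clamp is a choice made here).
    progress = min(epoch, total_epochs) / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# One update per epoch: each epoch simply takes the next point on the curve.
total_epochs = 200  # assumed; the thread says 200-250
schedule = [cosine_annealed_lr(e, total_epochs) for e in range(total_epochs + 1)]
```

Because there is only one cycle, the value starts at `lr_max`, decreases monotonically, and reaches `lr_min` at the final epoch. If the training code uses PyTorch, `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max` set to the number of epochs and `eta_min=1e-5`, stepped once per epoch, gives this kind of schedule; whether that scheduler is what was actually used is not stated here.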
Thank you!
Are you calling the cosine annealing every epoch? Are you restarting it? Where are those details?