"In METAOD, we employ two strategies that help stabilize the training. First, we leverage meta-feature based (rather than random) initialization. Second, we use cyclical learning rates that help escape saddle points for better local optima [43]."
[43] L. N. Smith. Cyclical learning rates for training neural networks. In WACV, pages 464–472. IEEE Computer Society, 2017.
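For context, the policy in [43] is typically the "triangular" schedule: the learning rate ramps linearly from a base value up to a maximum, then back down, and the cycle repeats. Below is a minimal sketch of that schedule; the `base_lr`, `max_lr`, and `step_size` values are illustrative placeholders, not the values METAOD actually uses in `core.py`.

```python
import numpy as np

def triangular_clr(iteration, base_lr=1e-3, max_lr=6e-3, step_size=50):
    """Triangular cyclical learning rate, following Smith (2017) [43].

    Ramps linearly from base_lr up to max_lr over step_size iterations,
    then back down, repeating. All three hyperparameters here are
    illustrative assumptions, not values taken from METAOD's core.py.
    """
    cycle = np.floor(1 + iteration / (2 * step_size))
    x = np.abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# The first half-cycle increases the rate, the second half decreases it,
# so the schedule alternates between ramping up and ramping down.
lrs = [triangular_clr(t) for t in range(200)]
```

Under this policy the rate starts at `base_lr` and increases first; whether `core.py:118-136` implements exactly this variant would need to be checked against the code itself.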
We do use this technique for better training :)
Mmmh, I see. Thank you for answering so quickly and clearly :)
What criterion was used to choose the learning rates in core.py:118-136? It looks like the rate alternates between increasing and decreasing over the iterations, and that it starts out increasing, is that right?