OK, thanks. I guess this confirms that we need the Noam-type learning rate scheduler. At some point I'd like to try something like Adam, but with weight decay that decays toward the initial (random) values rather than toward 0 [and also with a Noam-type scheduler].
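For concreteness, the decay-to-initial idea could look something like the following minimal PyTorch sketch (the helper name `make_decay_to_init` and the L2 form of the penalty are hypothetical illustrations, not snowfall code):

```python
import torch

def make_decay_to_init(model: torch.nn.Module, weight_decay: float = 1e-4):
    # Snapshot the initial (random) parameter values once, before training.
    init = {name: p.detach().clone() for name, p in model.named_parameters()}

    def decay_term():
        # 0.5 * wd * ||w - w0||^2: pulls weights back toward their initial
        # values instead of toward zero.
        return 0.5 * weight_decay * sum(
            ((p - init[name]) ** 2).sum() for name, p in model.named_parameters()
        )

    return decay_term

# Hypothetical usage inside a training loop:
#   decay = make_decay_to_init(model)
#   loss = base_loss + decay()
```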
Yes, maybe I can try to set the lr (learning rate) according to the epoch: when the epoch is <= 5 (or some other value), set lr = 0.0001, and when the epoch is > 5, set lr = 0.00001.
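Expressed as code, that schedule would be roughly the following (a minimal sketch assuming a standard PyTorch optimizer; `adjust_lr` is a hypothetical helper):

```python
def adjust_lr(optimizer, epoch):
    # Piecewise-constant schedule from the comment above:
    # lr = 0.0001 for epochs <= 5, lr = 0.00001 afterwards.
    lr = 0.0001 if epoch <= 5 else 0.00001
    for group in optimizer.param_groups:
        group["lr"] = lr
```

It would be called once at the start of each epoch, before that epoch's training loop runs.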
Study the Noam optimizer and how it sets the learning rate; it is just a modification of Adam that applies a certain learning rate schedule.
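For reference, the Noam schedule (from "Attention Is All You Need") sets lr = factor * d_model**-0.5 * min(step**-0.5, step * warmup**-1.5). Below is a minimal sketch of a wrapper that applies it to an existing optimizer, typically Adam (the class name and default values are illustrative, not snowfall's actual implementation):

```python
class NoamLR:
    def __init__(self, optimizer, d_model=256, warmup=4000, factor=1.0):
        # Wraps an existing optimizer and overrides its learning rate on
        # every step according to the Noam schedule.
        self.optimizer = optimizer
        self.d_model = d_model
        self.warmup = warmup
        self.factor = factor
        self.step_num = 0

    def step(self):
        self.step_num += 1
        # Linear warmup for `warmup` steps, then inverse-square-root decay.
        lr = (self.factor * self.d_model ** -0.5
              * min(self.step_num ** -0.5,
                    self.step_num * self.warmup ** -1.5))
        for group in self.optimizer.param_groups:
            group["lr"] = lr
        self.optimizer.step()
```

The net effect is that the learning rate ramps up linearly during warmup and then decays as step**-0.5, rather than staying fixed as in the epoch-based scheme above.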