Make learning rate of optimizers public

ebetica / autogradpp

Direct C++ Interface to PyTorch

MIT License


Closed jgehring closed 6 years ago

jgehring commented 6 years ago

Otherwise, there's no way to anneal the learning rate (other than creating a new optimizer?)
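To illustrate the request, here is a minimal sketch of why a public learning rate helps. `ToySGD` and `anneal` are hypothetical names made up for this example and are not autogradpp's actual classes; the point is only that a writable `lr` member lets the caller decay the rate in place instead of constructing a new optimizer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy optimizer for illustration only (not autogradpp's API).
struct ToySGD {
  double lr;  // public, so callers can change it between steps

  explicit ToySGD(double initial_lr) : lr(initial_lr) {}

  // Plain gradient-descent update: p <- p - lr * g.
  void step(std::vector<double>& params,
            const std::vector<double>& grads) const {
    for (std::size_t i = 0; i < params.size() && i < grads.size(); ++i) {
      params[i] -= lr * grads[i];
    }
  }
};

// Multiply the learning rate by a positive factor in place.
// The optimizer object is reused; nothing is reconstructed.
inline void anneal(ToySGD& opt, double factor) {
  assert(factor > 0.0);
  opt.lr *= factor;
}
```

With a private `lr` and no setter, the only way to get the same effect would be to build a fresh optimizer with the new rate, which also discards any internal state the optimizer keeps.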

eugene-kharitonov commented 6 years ago

Hello @jgehring! Just for my education, I'm curious - is annealing ever performed for Adam?

jgehring commented 6 years ago

It's not very common, but it has been done, for example in the "Attention Is All You Need" paper on NMT.
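For reference, the schedule that paper uses with Adam is lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5): a linear warmup followed by inverse-square-root decay. Below is a rough sketch of it as a standalone function. The name `transformer_lr` is made up for this example, and the defaults (512 and 4000) are the paper's base-model settings rather than anything from autogradpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Learning-rate schedule from "Attention Is All You Need":
//   lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
// `step` is 1-based. The function name and defaults are illustrative.
inline double transformer_lr(long step,
                             double d_model = 512.0,
                             double warmup_steps = 4000.0) {
  assert(step >= 1);
  const double s = static_cast<double>(step);
  const double decay = std::pow(s, -0.5);                  // after warmup
  const double warmup = s * std::pow(warmup_steps, -1.5);  // during warmup
  return std::pow(d_model, -0.5) * std::min(decay, warmup);
}
```

The rate rises linearly until `step == warmup_steps`, where the two branches are equal, and falls off as 1/sqrt(step) afterwards. Applying it each iteration requires writing the result into the optimizer's learning rate, which is exactly what a public member (or setter) would allow.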