ih4cku/blog
deprecated, Git issues are great for writing blogs :)
adaptive learning rate #38
Open, opened by ih4cku 8 years ago
ih4cku commented 8 years ago
Chapter 4. Beyond Gradient Descent
Alec Radford's animations for optimization algorithms
An overview of gradient descent optimization algorithms
Caffe solvers
COMPARISON: SGD VS MOMENTUM VS RMSPROP VS MOMENTUM+RMSPROP VS ADAGRAD
What are differences between update rules like AdaDelta, RMSProp, AdaGrad and AdaM?
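For quick reference, a minimal sketch of the update rules the links above compare (SGD, momentum, AdaGrad, RMSProp, Adam), written for a single scalar parameter. The function names, the `grad` callback, and every hyperparameter default (`lr`, `mu`, `rho`, `b1`, `b2`, `eps`) are illustrative assumptions, not taken from any of the linked posts or libraries.

```python
import math

# All names and default values below are assumptions chosen for illustration.
# `grad` is any function returning the gradient at x.

def sgd(grad, x, lr=0.1, steps=100):
    for _ in range(steps):
        x -= lr * grad(x)                      # plain gradient step
    return x

def momentum(grad, x, lr=0.1, mu=0.9, steps=100):
    v = 0.0                                    # velocity
    for _ in range(steps):
        v = mu * v - lr * grad(x)              # decay old velocity, add new step
        x += v
    return x

def adagrad(grad, x, lr=0.5, eps=1e-8, steps=100):
    g2 = 0.0                                   # running SUM of squared gradients
    for _ in range(steps):
        g = grad(x)
        g2 += g * g
        x -= lr * g / (math.sqrt(g2) + eps)    # effective step only shrinks
    return x

def rmsprop(grad, x, lr=0.1, rho=0.9, eps=1e-8, steps=100):
    avg = 0.0                                  # exponential AVERAGE of squared gradients
    for _ in range(steps):
        g = grad(x)
        avg = rho * avg + (1 - rho) * g * g
        x -= lr * g / (math.sqrt(avg) + eps)
    return x

def adam(grad, x, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=100):
    m = v = 0.0                                # first and second moment estimates
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)              # bias correction for zero init
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

if __name__ == "__main__":
    quad_grad = lambda x: 2.0 * x              # gradient of f(x) = x^2
    for opt in (sgd, momentum, adagrad, rmsprop, adam):
        print(opt.__name__, opt(quad_grad, 5.0, steps=500))
```

The main contrast: AdaGrad divides by a growing sum, so its step size decays monotonically, while RMSProp and Adam divide by a moving average and can recover larger steps when gradients shrink. Adam additionally applies momentum to the gradient itself.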
ih4cku commented 8 years ago
https://zhuanlan.zhihu.com/p/22252270
http://climin.readthedocs.io/en/latest/index.html
https://blog.wtf.sg/2014/08/28/implementing-adadelta/
http://www.cnblogs.com/neopenx/p/4768388.html
http://downhill.readthedocs.io/en/stable/index.html
http://colinraffel.com/wiki/stochastic_optimization_techniques
Batch normalization (BN)
http://shuokay.com/2016/05/28/batch-norm/