Closed — accosmin closed this issue 8 years ago
Adadelta produces NaNs when training convolutional networks on MNIST.

This also appears for other stochastic methods (e.g. Adam or AdaGrad) and seems related to the "epsilon" parameter being too high when normalizing the weighted gradient. However, these configurations are correctly pruned during tuning.
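For reference, a minimal sketch of the standard Adadelta update (per Zeiler's formulation), showing where epsilon enters the normalization and a finiteness guard like the pruning mentioned above. Function and variable names are illustrative, not this repository's API:

```python
import numpy as np

def adadelta_step(grad, avg_sq_grad, avg_sq_delta, rho=0.95, eps=1e-6):
    """One Adadelta update. Returns the parameter delta and the updated
    running averages. `eps` is the normalization constant discussed in
    the issue: if the gradient history is near zero, the ratio of the
    two square roots is sensitive to its value."""
    avg_sq_grad = rho * avg_sq_grad + (1 - rho) * grad ** 2
    delta = -np.sqrt(avg_sq_delta + eps) / np.sqrt(avg_sq_grad + eps) * grad
    avg_sq_delta = rho * avg_sq_delta + (1 - rho) * delta ** 2
    return delta, avg_sq_grad, avg_sq_delta

def is_finite_update(delta):
    # Guard that rejects NaN/inf updates, mirroring the tuning-time
    # pruning of bad epsilon configurations mentioned above.
    return bool(np.all(np.isfinite(delta)))
```

A quick usage check on a small gradient vector confirms the update stays finite for a reasonable epsilon:

```python
grad = np.array([0.1, -0.2])
delta, sq_g, sq_d = adadelta_step(grad, np.zeros(2), np.zeros(2))
print(is_finite_update(delta))
```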