pluskid / Mocha.jl

Deep Learning framework for Julia

Fixes for SGD Solver and enhancements for Adam #102

Closed · benmoran closed 9 years ago

benmoran commented 9 years ago

Yesterday's PR accidentally removed the dynamic per-iteration updates to the learning rate and momentum for the SGD and Nesterov solvers; this PR puts them back.
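For context, "dynamic updates" means the solver re-evaluates its learning-rate and momentum policies on every iteration instead of freezing them at setup time. Below is a minimal, self-contained Julia sketch of that pattern. The policy types (`InvPolicy`, `RampMomentum`) and the `get_learning_rate`/`get_momentum` names echo Mocha's conventions but are illustrative assumptions, not the library's actual implementation.

```julia
# Hypothetical sketch of per-iteration LR/momentum scheduling;
# names are illustrative, not Mocha's exact internals.

# Inverse-decay learning rate: lr = base / (1 + gamma*t)^power
struct InvPolicy
    base_lr::Float64
    gamma::Float64
    power::Float64
end
get_learning_rate(p::InvPolicy, iter::Int) =
    p.base_lr / (1 + p.gamma * iter)^p.power

# Momentum ramp: grow toward max_momentum over the first ramp_iters iterations
struct RampMomentum
    max_momentum::Float64
    ramp_iters::Int
end
get_momentum(p::RampMomentum, iter::Int) =
    p.max_momentum * min(1.0, iter / p.ramp_iters)

# The point of the fix: these must be re-evaluated inside the solver loop,
# every iteration, not computed once before it starts.
lr_policy  = InvPolicy(0.01, 1e-4, 0.75)
mom_policy = RampMomentum(0.9, 1000)
for iter in 1:5
    lr  = get_learning_rate(lr_policy, iter)
    mom = get_momentum(mom_policy, iter)
    println("iter=$iter lr=$lr momentum=$mom")
end
```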

There are also some Adam improvements. Adam can also benefit from the LRPolicy, so support for it is added. I also found it useful to serialize the solver state after all: if I restarted optimization with the moment estimates reset, the solver often diverged. This version saves the whole solver state, at the cost of making the snapshots roughly 3x bigger when Adam is used.
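For context on why resetting the moment estimates hurts: Adam's bias correction divides by `(1 - β^t)`, so restarting with `m = v = 0` and `t = 0` treats a partially trained network like a fresh run and produces disproportionately large early steps. Here is a minimal sketch of a standard Adam update showing the state involved; the `AdamState` fields and `adam_update!` name are hypothetical illustrations, not Mocha's exact internals. The `lr` keyword is where a per-iteration value from the LRPolicy would be fed in, and serializing the two moment vectors alongside the parameters is what accounts for the roughly 3x snapshot size.

```julia
# Illustrative Adam state; these moment estimates are what the PR
# now serializes with snapshots. Field names are assumptions.
mutable struct AdamState
    m::Vector{Float64}   # first moment estimate
    v::Vector{Float64}   # second moment estimate
    t::Int               # timestep, needed for bias correction
end

function adam_update!(θ::Vector{Float64}, g::Vector{Float64}, s::AdamState;
                      lr=1e-3, β1=0.9, β2=0.999, ϵ=1e-8)
    s.t += 1
    @. s.m = β1 * s.m + (1 - β1) * g      # update first moment
    @. s.v = β2 * s.v + (1 - β2) * g^2    # update second moment
    m̂ = s.m ./ (1 - β1^s.t)               # bias-corrected moments;
    v̂ = s.v ./ (1 - β2^s.t)               # both depend on t
    @. θ -= lr * m̂ / (sqrt(v̂) + ϵ)
    return θ
end

# If a snapshot stored only θ and we reset m, v, t on restart, the first
# updates after loading would behave like step 1 of a fresh run, which is
# what made the solver diverge; saving AdamState avoids that.
θ = zeros(3)
s = AdamState(zeros(3), zeros(3), 0)
adam_update!(θ, [0.1, -0.2, 0.3], s)
```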