IDSIA / brainstorm

Fast, flexible and fun neural networks.

MomentumStepper has error #96

Closed. jzilly closed this issue 8 years ago.

jzilly commented 8 years ago

The momentum stepper should implement the following equations (g is the gradient):

v = gamma * v + alpha * g
theta = theta - v

Instead, the implementation does:

v = gamma * v - alpha * g
theta = theta + v

In essence there is a sign error. Will submit a pull request soon.

flukeskywalker commented 8 years ago

Are you sure? See: http://machinelearning.wustl.edu/mlpapers/paper_files/icml2013_sutskever13.pdf or http://publications.idiap.ch/downloads/reports/1995/95-04.pdf

The way to understand it is that we'd normally have θ -> θ - αg. Instead, we want θ -> θ - αg + γv, i.e. we add a velocity term. What we add at this step in total is the effective velocity (needed for the next step), which is -αg + γv.

jzilly commented 8 years ago

Thank you Rupesh for the reply.

Looking at it more closely, I suppose the two formulations are equivalent: the velocity v just has the opposite sign.

See you tomorrow.

(Replying by email to Rupesh Kumar Srivastava's comment of 16.11.2015, 20:41, shown above.)
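
For anyone landing on this issue later, the equivalence discussed in this thread can be checked numerically. Substituting u = -v into "v = gamma * v + alpha * g, theta = theta - v" gives "u = gamma * u - alpha * g, theta = theta + u", which is the second form. The sketch below is only an illustration under stated assumptions: the function names `step_subtract`, `step_add`, and `grad_fn`, the toy quadratic objective, and the values of alpha and gamma are all hypothetical and are not taken from brainstorm's MomentumStepper.

```python
import numpy as np


def step_subtract(theta, v, grad, alpha, gamma):
    """Form from the issue report: v = gamma*v + alpha*g, theta = theta - v."""
    v = gamma * v + alpha * grad
    return theta - v, v


def step_add(theta, v, grad, alpha, gamma):
    """Form from the reply: v = gamma*v - alpha*g, theta = theta + v."""
    v = gamma * v - alpha * grad
    return theta + v, v


def grad_fn(theta):
    # Gradient of the toy objective 0.5 * ||theta||^2 (an illustrative choice).
    return theta


# Hypothetical hyperparameters and starting point, chosen only for the demo.
alpha, gamma = 0.1, 0.9
theta_a = theta_b = np.array([1.0, -2.0, 3.0])
v_a = v_b = np.zeros(3)

for _ in range(50):
    theta_a, v_a = step_subtract(theta_a, v_a, grad_fn(theta_a), alpha, gamma)
    theta_b, v_b = step_add(theta_b, v_b, grad_fn(theta_b), alpha, gamma)

# The parameters follow the same trajectory; only the stored velocity flips sign.
params_match = np.allclose(theta_a, theta_b)
velocities_negated = np.allclose(v_a, -v_b)
print(params_match, velocities_negated)
```

Both checks should hold when the two steppers start from the same parameters and a zero velocity, so the choice between the two forms is a sign convention for the stored velocity rather than a difference in behavior.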