nicodjimenez / lstm

Minimal, clean example of LSTM neural network training in Python, for learning purposes.

Remove incorrect var `top_diff_s` in derivative #49

Closed tonyzhang617 closed 2 years ago

tonyzhang617 commented 2 years ago

The derivative `ds` was calculated incorrectly; this PR removes the erroneous `top_diff_s` term. After the change, running the example yields a significantly lower training loss.

Output before change: `iter 99: y_pred = [-0.50033, 0.20106, 0.09912, -0.49923], loss: 2.611e-06`
Output after change: `iter 99: y_pred = [-0.49995, 0.19999, 0.10001, -0.50006], loss: 6.078e-09`
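For context, the gradient step under discussion can be sketched as follows. The variable names (`o`, `top_diff_h`, `top_diff_s`) follow the repo's conventions, but the exact surrounding code and the sample values are assumptions for illustration only:

```python
import numpy as np

# Hypothetical sketch of the cell-state gradient step being debated.
# o: output gate activations; top_diff_h: dL/dh from the layer above;
# top_diff_s: dL/ds carried back from the next time step.
o = np.array([0.5, 0.8])
top_diff_h = np.array([0.1, -0.2])
top_diff_s = np.array([0.05, 0.0])

# Before the PR: cell-state gradient includes the recurrent term.
ds_before = o * top_diff_h + top_diff_s

# After the PR: the recurrent term is dropped.
ds_after = o * top_diff_h

print(ds_before)  # the two differ exactly by top_diff_s
print(ds_after)
```

The PR amounts to dropping the `+ top_diff_s` term; the dispute below is about whether that term is mathematically required.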

nicodjimenez commented 2 years ago

Are you sure about this? I haven't looked at this in a while. Doesn't top_diff_s need to be propagated recursively along the constant error carousel?
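The recursion being referred to can be sketched in standard LSTM notation (an assumption about the blog post's exact symbols; this repo's cell applies no tanh to the cell output):

```latex
% Cell-state update and output (constant error carousel):
s(t) = g(t)\,i(t) + s(t-1)\,f(t), \qquad h(t) = s(t)\,o(t)
% The gradient w.r.t. s(t) therefore has two contributions:
\frac{\partial L}{\partial s(t)}
  = o(t)\,\frac{\partial L}{\partial h(t)}
  + f(t+1)\,\frac{\partial L}{\partial s(t+1)}
```

The second term, scaled by the next step's forget gate, is what `top_diff_s` would carry backward through time.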

nicodjimenez commented 2 years ago

Forward equation:

[equation image]

Derivatives:

[equation images]

I don't see where the math is wrong.
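One way to settle a dispute like this empirically is a finite-difference check. A minimal sketch on a scalar two-step cell (hypothetical values, not the repo's code) shows that the gradient flowing into the earlier cell state really does pick up a recurrent term scaled by the forget gate:

```python
# Scalar two-step cell-state recurrence: s2 = g2*i2 + s1*f2, h2 = s2*o2,
# with loss L = h2**2. All values are hypothetical.
g2, i2, f2, o2 = 0.3, 0.7, 0.9, 0.5

def loss(s1):
    s2 = g2 * i2 + s1 * f2   # constant error carousel step
    h2 = s2 * o2
    return h2 ** 2

s1 = 0.4
s2 = g2 * i2 + s1 * f2

# Analytic gradient by the chain rule: dL/ds1 = 2*h2 * o2 * f2.
# The f2 factor is the recurrent contribution that top_diff_s carries back.
analytic = 2 * (s2 * o2) * o2 * f2

# Central finite difference for comparison.
eps = 1e-6
numeric = (loss(s1 + eps) - loss(s1 - eps)) / (2 * eps)

print(analytic, numeric)  # the two should agree closely
```

Agreement between the analytic and numeric gradients confirms the recurrent term exists in the math; whether the repo's specific `top_diff_s` plumbing computes it correctly is a separate question.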

More details from the blog post:

https://nicodjimenez.github.io/2014/08/08/lstm.html
[equation image]