Closed: arryon closed this issue 7 years ago
The tanh function is applied in line 90.
Hi, sorry to bother you again.
I read this line in your code: self.state.h = self.state.s * self.state.o
But according to the paper and the figure, it should probably look like this:
self.state.h = np.tanh(self.state.s) * self.state.o
Could you tell me which one is right?
In L95 of lstm.py, as far as I can see, you do not apply tanh() to the new cell state before multiplying it by the squashed o(t).
As shown in the last equation on page 20 of the article you mention in your readme, and in this excellent tutorial page I found (https://colah.github.io/posts/2015-08-Understanding-LSTMs/), tanh() has to be applied to the new cell state before it is multiplied by o(t). I don't see that in your code, so unless it is handled somewhere else that I failed to notice, it should be corrected.
Otherwise, this is an excellent resource, thanks a lot :)
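
For anyone comparing the two variants discussed in this thread, here is a minimal sketch of both, not the actual code from lstm.py. The function name lstm_hidden and the sample values are made up for illustration, and it uses math.tanh on plain lists instead of np.tanh so it runs without NumPy.

```python
import math

def lstm_hidden(s, o, squash=True):
    """Compute h(t) element-wise from cell state s and output gate o.

    squash=True  gives h = o * tanh(s)  (the form in the linked tutorial)
    squash=False gives h = o * s        (the form quoted from the code)
    """
    if squash:
        return [o_i * math.tanh(s_i) for s_i, o_i in zip(s, o)]
    return [o_i * s_i for s_i, o_i in zip(s, o)]

# Hypothetical values: cell state is unbounded, output gate lies in (0, 1].
s = [0.5, -2.0, 3.0]
o = [0.9, 0.5, 1.0]

h_squashed = lstm_hidden(s, o, squash=True)
h_raw = lstm_hidden(s, o, squash=False)

print("with tanh:   ", h_squashed)
print("without tanh:", h_raw)
```

With tanh, every entry of h stays strictly between -1 and 1 no matter how large the cell state grows. Without it, h simply scales with s, which is the difference the question is about.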