Open mikechen66 opened 4 years ago
self.state.h = self.state.o * np.tanh(self.state.s)
@mikechen66 There is also a problem in the backpropagation: it ignores the derivative of the tanh function.
def top_diff_is(self, top_diff_h, top_diff_s):
# notice that top_diff_s is carried along the constant error carousel
ds = self.state.o * top_diff_h + top_diff_s
do = self.state.s * top_diff_h
di = self.state.g * ds
dg = self.state.i * ds
df = self.s_prev * ds
ds = self.state.o * (1 - np.tanh(self.state.s) ** 2) * top_diff_h + top_diff_s
do = np.tanh(self.state.s) * top_diff_h
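The corrected derivatives above follow from h = o * tanh(s). As a sanity check, here is a minimal sketch (standalone numpy, with made-up values for s, o, and the upstream gradient; not the repo's actual code) that compares the analytic dL/ds against a central finite difference:

```python
import numpy as np

np.random.seed(1)
s = np.random.randn(4)           # cell state (hypothetical values)
o = np.random.rand(4)            # output-gate activations
top_diff_h = np.random.randn(4)  # upstream gradient dL/dh

# With h = o * tanh(s), the analytic gradients are:
ds = o * (1 - np.tanh(s) ** 2) * top_diff_h  # dL/ds (before adding top_diff_s)
do = np.tanh(s) * top_diff_h                 # dL/do

# Finite-difference check of dL/ds, treating L = sum(top_diff_h * h)
eps = 1e-6
ds_num = np.zeros_like(s)
for i in range(len(s)):
    sp, sm = s.copy(), s.copy()
    sp[i] += eps
    sm[i] -= eps
    ds_num[i] = np.sum(top_diff_h * (o * np.tanh(sp) - o * np.tanh(sm))) / (2 * eps)

print(np.max(np.abs(ds - ds_num)))  # should be tiny (near machine precision)
```

The `top_diff_s` term from the constant error carousel is additive and unchanged, so it is left out of the check.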
I think you're right that some/most implementations use the tanh, but that's not how I defined the forward pass in the blog article:
https://nicodjimenez.github.io/2014/08/08/lstm.html
If you want to make a PR to add that as an option, that's fine with me.
yes
Issue: lstm.py--the 98th line.
There is a problem with the line of code: self.state.h = self.state.s * self.state.o. It forgets the tanh function. The formula is h_{t} = o_{t} * tanh(s_{t}). Therefore, the corrected line of code is as follows.
self.state.h = np.tanh(self.state.s) * self.state.o
The relevant partial lines of code are pasted above.
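To make the disagreement concrete, here is a small standalone sketch (plain numpy, with hypothetical gate values, not the repo's code) comparing the blog's forward pass h = s * o against the standard LSTM output h = tanh(s) * o:

```python
import numpy as np

np.random.seed(0)
o = np.random.rand(5)    # output-gate activations in (0, 1)
s = np.random.randn(5)   # cell state

h_plain = s * o              # what lstm.py line 98 computes (no tanh)
h_tanh = np.tanh(s) * o      # standard LSTM: h_t = o_t * tanh(s_t)

# tanh squashes the cell state into (-1, 1), so the two definitions
# diverge whenever |s| is large enough that tanh(s) != s.
print(np.max(np.abs(h_plain - h_tanh)))
```

Both are valid forward passes as long as the backward pass is derived consistently; the standard form just bounds the hidden state.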