Closed: keineahnung2345 closed this issue 5 years ago.
@keineahnung2345 Thank you for your report. You are right, it's a bug.
Actually, that is not a bug. I think we will get the same behavior by putting a dense layer after the RNN layer. I need to refresh my understanding of RNNs.
My implementation is flexible and also supports multi-sequence output. https://github.com/rushter/MLAlgorithms/blob/6e383f73e87ff1afb62ff4d711e4d8dd245ae923/mla/neuralnet/layers/basic.py#L149
The LSTM implementation follows the same approach.
Here is an example of how to get the same RNN formula by adding a Dense layer: https://github.com/rushter/MLAlgorithms/blob/6e383f73e87ff1afb62ff4d711e4d8dd245ae923/examples/nnet_rnn_text_generation.py#L43
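To illustrate the equivalence, here is a small standalone NumPy sketch (my own illustration with toy dimensions, not code from this repository): the RNN layer only updates and returns the hidden state, and a Dense layer on top reproduces the `y<t> = g(Wya a<t> + by)` output formula.

```python
import numpy as np

def rnn_step(x_t, a_prev, Wax, Waa, ba):
    # Hidden state update: a<t> = tanh(Wax x<t> + Waa a<t-1> + ba)
    return np.tanh(Wax @ x_t + Waa @ a_prev + ba)

def dense(a_t, Wya, by, activation=np.tanh):
    # Output: y<t> = g(Wya a<t> + by), i.e. the Wya/by step from the course
    return activation(Wya @ a_t + by)

rng = np.random.default_rng(0)
# Toy sizes: input 3, hidden 4, output 2
Wax, Waa, ba = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=4)
Wya, by = rng.normal(size=(2, 4)), rng.normal(size=2)

a = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):      # a toy sequence of 5 timesteps
    a = rnn_step(x_t, a, Wax, Waa, ba)   # what the RNN layer returns
    y = dense(a, Wya, by)                # what the extra Dense layer adds on top
print(y.shape)  # (2,)
```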
I see, so the RNN layer should always be used with a Dense layer, right?
After reviewing Andrew Ng's Deep RNN lecture, I found that the lower RNN layers just return the hidden states (a), and only the last layer of the stack produces the output y. So the implementation should be correct, thanks!
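To make sure I understand the stacked picture, here is a small self-contained NumPy sketch (my own illustration, not this repository's API): each RNN layer passes its hidden-state sequence to the layer above, and only the top of the stack applies the Wya/by projection.

```python
import numpy as np

def rnn_layer(xs, Wx, Wh, b):
    # Run one RNN layer over the whole sequence; return the hidden states a<1..T>.
    h = np.zeros(Wh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h + b)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(1)
xs = rng.normal(size=(5, 3))  # 5 timesteps, input size 3

# Two stacked RNN layers: both only emit hidden states a, no y.
a1 = rnn_layer(xs, rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=4))
a2 = rnn_layer(a1, rng.normal(size=(4, 4)), rng.normal(size=(4, 4)), rng.normal(size=4))

# Only the top of the stack turns hidden states into outputs: y<t> = g(Wya a<t> + by)
Wya, by = rng.normal(size=(2, 4)), rng.normal(size=2)
ys = np.tanh(a2 @ Wya.T + by)
print(ys.shape)  # (5, 2)
```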
According to Andrew Ng's deep learning course (where a denotes the hidden state and y the output value): we get the output value by multiplying the hidden state by a weight matrix Wya, adding a bias by, and then passing the result through an activation function.
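In other words, the per-timestep formulas from the course (as I remember the notation) are:

$$a^{\langle t \rangle} = g_1\left(W_{aa}\, a^{\langle t-1 \rangle} + W_{ax}\, x^{\langle t \rangle} + b_a\right)$$
$$y^{\langle t \rangle} = g_2\left(W_{ya}\, a^{\langle t \rangle} + b_y\right)$$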
But from https://github.com/rushter/MLAlgorithms/blob/6e383f73e87ff1afb62ff4d711e4d8dd245ae923/mla/neuralnet/layers/recurrent/rnn.py#L55-L63, it seems the hidden state is directly returned.
@rushter Can you please share the RNN reference you followed, or confirm this as a bug? If it's a bug, I'd like to create a PR to fix it. 😄