Open lhl881210 opened 7 years ago
My understanding is that `dense2 = Dense(output_dim)` simply facilitates the transfer of the encoder output to the decoder with the correct dimensions, since the decoder may have different dimensions. That said, you're right: the encoder output should be of size `hidden_dim`, since it is the final hidden state that represents the embedding. The encoder is supposed to end at the layer before `dense2`. I think the confusion results from the fact that the output of `dense2` is later called `encoded_seq`.
Hi, @abhaikollara and @farizrahman4u
I found that the encoder output is `encoded_seq = dense2(encoded_seq)`, where `dense2 = Dense(output_dim)`. In my opinion, the dimension of `encoded_seq` should probably be `hidden_dim`.
For example, when input_dim=100, hidden_dim=200, output_dim=4, if `dense2 = Dense(output_dim)`, then the dimensionality changes as 100->200->4->200->4. In this case, the decoder may overfit easily. So `dense2 = Dense(hidden_dim)` may be better here. I may not be quite correct, but I would like to know your thoughts.
Thanks.
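
To make the dimension chain in the example concrete, here is a minimal sketch in plain Python. It only tracks feature sizes and does not use the library's code; the helper `dim_chain` and its `dense2_units` parameter are hypothetical names introduced for illustration, and the five-stage layout is assumed from the description in this thread.

```python
def dim_chain(input_dim, hidden_dim, output_dim, dense2_units):
    """Return the feature size at each stage, as described in this thread.

    Hypothetical helper: it models only the sizes, not the actual layers.
    """
    return [
        input_dim,     # raw input features
        hidden_dim,    # encoder hidden state
        dense2_units,  # output of dense2 (the tensor called encoded_seq)
        hidden_dim,    # decoder hidden state
        output_dim,    # final decoder output
    ]


# Current behaviour discussed above: dense2 = Dense(output_dim)
current = dim_chain(100, 200, 4, dense2_units=4)

# Proposed alternative: dense2 = Dense(hidden_dim)
proposed = dim_chain(100, 200, 4, dense2_units=200)

print(" -> ".join(map(str, current)))   # 100 -> 200 -> 4 -> 200 -> 4
print(" -> ".join(map(str, proposed)))  # 100 -> 200 -> 200 -> 200 -> 4
```

With `dense2_units=4`, the tensor passed to the decoder has only 4 features, which is the narrow point the question is about. With `dense2_units=200`, it keeps the full `hidden_dim`.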