I apologize for the very slow response. I hope you have solved the problem.
The problematic part of your code is return_sequences=True. It makes the encoder output a sequence of vectors (one per timestep) instead of a single vector.
You can verify this by checking the output shapes:
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64, return_sequences=True))(encoder)
print(encoder.get_shape()) # => (?, ?, 128)
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64))(encoder)
print(encoder.get_shape()) # => (?, 128)
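The decoder's LSTM(128) expects initial_state to be two 2-D tensors of shape (batch, 128) (the hidden state h and the cell state c), so the 3-D sequence output above cannot be passed to it directly, which is what the shape-mismatch error is complaining about. If you do need the encoder's per-timestep outputs (for example to add attention later), one option, sketched here as a suggestion using the same variable names as above rather than code from this repo, is to ask the Bidirectional LSTM for its final states explicitly and concatenate the forward and backward parts:
from keras.layers import Embedding, LSTM, Bidirectional, Concatenate
# Encoder: keep the per-timestep outputs AND return the final states.
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder_seq, fwd_h, fwd_c, bwd_h, bwd_c = Bidirectional(LSTM(64, return_sequences=True, return_state=True))(encoder)
# Concatenate forward/backward states so each has size 128, matching the decoder's LSTM(128).
state_h = Concatenate()([fwd_h, bwd_h])
state_c = Concatenate()([fwd_c, bwd_c])
decoder = Embedding(output_dict_size, 64, input_length=OUTPUT_LENGTH, mask_zero=True)(decoder_input)
decoder = LSTM(128, return_sequences=True)(decoder, initial_state=[state_h, state_c])
If you don't need the per-timestep encoder outputs, the simpler fix below is enough.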
The following code works for me:
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64))(encoder)
decoder = Embedding(output_dict_size, 64, input_length=OUTPUT_LENGTH, mask_zero=True)(decoder_input)
decoder = LSTM(128, return_sequences=True)(decoder, initial_state=[encoder, encoder])
decoder = TimeDistributed(Dense(output_dict_size, activation="softmax"))(decoder)
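For completeness, here is roughly how that snippet would be wired into a trainable model; the Input placeholders, optimizer, and loss are assumptions for illustration, not taken from your code:
from keras.layers import Input, Embedding, LSTM, Bidirectional, Dense, TimeDistributed
from keras.models import Model
# Assumed placeholders; adjust lengths and vocabulary sizes to your data.
encoder_input = Input(shape=(INPUT_LENGTH,))
decoder_input = Input(shape=(OUTPUT_LENGTH,))
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64))(encoder)  # single vector of shape (batch, 128)
decoder = Embedding(output_dict_size, 64, input_length=OUTPUT_LENGTH, mask_zero=True)(decoder_input)
decoder = LSTM(128, return_sequences=True)(decoder, initial_state=[encoder, encoder])
decoder = TimeDistributed(Dense(output_dict_size, activation="softmax"))(decoder)
model = Model(inputs=[encoder_input, decoder_input], outputs=decoder)
model.compile(optimizer="adam", loss="categorical_crossentropy")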
Hey! I've been playing around with your model and I'd like to modify the LSTM encoder into a bidirectional LSTM, but I am getting an error. When I try initial_state=[encoder, encoder] I get a very long error that ends in a shape mismatch. If it is not too much trouble, could I have your thoughts on how to properly implement this?