wanasit / katakana

Training machine to write Katakana using Sequence-to-Sequence technique

How to add bidirectional layer? #1

Closed xun468 closed 5 years ago

xun468 commented 6 years ago

Hey! I've been playing around with your model and I'd like to modify the LSTM encoder into a bidirectional LSTM.

encoder_input = Input(shape=(input_length,))
decoder_input = Input(shape=(output_length,))

encoder = Embedding(input_dict_size, 64, input_length=input_length, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(UNITS, return_sequences=True))(encoder)

decoder = Embedding(output_dict_size, 64, input_length=output_length, mask_zero=True)(decoder_input)
decoder = LSTM(UNITS*2, return_sequences=True)(decoder, initial_state=[encoder])
decoder = TimeDistributed(Dense(output_dict_size, activation="softmax"))(decoder)

model = Model(inputs=[encoder_input, decoder_input], outputs=[decoder])
model.compile(optimizer='adam', loss='categorical_crossentropy')

return model

I am getting the error

ValueError: An initial_state was passed that is not compatible with cell.state_size. Received state_spec=[InputSpec(shape=(None, 50, 128), ndim=3)]; however cell.state_size is (128, 128)

However, when I try initial_state=[encoder, encoder], I get a very long error that ends in a shape mismatch. If it is not too much trouble, could I have your thoughts on how to implement this properly?

wanasit commented 5 years ago

I apologize for the very slow response. I hope you have solved the problem.

The problematic part of your code is return_sequences=True. It makes the encoder's output a sequence of vectors instead of a single vector, so it cannot be used as the decoder's initial state.

You can check it by:

encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64, return_sequences=True))(encoder)
print(encoder.get_shape())  # => (?, ?, 128) -- a sequence of 128-dim vectors

encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64))(encoder)
print(encoder.get_shape())  # => (?, 128) -- a single 128-dim vector (64 forward + 64 backward)

The following code works for me:

encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64))(encoder)  # a single 128-dim vector: 64 forward + 64 backward

decoder = Embedding(output_dict_size, 64, input_length=OUTPUT_LENGTH, mask_zero=True)(decoder_input)
# the decoder LSTM's state_size is (128, 128), so the 128-dim encoder vector can seed both h and c
decoder = LSTM(128, return_sequences=True)(decoder, initial_state=[encoder, encoder])
decoder = TimeDistributed(Dense(output_dict_size, activation="softmax"))(decoder)
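
For completeness, here is a minimal sketch of the whole model assembled around this encoder/decoder, mirroring the Model/compile lines from the snippet in your question. The placeholder sizes at the top are just for illustration, not the values used in the repo:

from keras.layers import Input, Embedding, LSTM, Bidirectional, Dense, TimeDistributed
from keras.models import Model

# placeholder sizes -- replace with the real vocabulary sizes and sequence lengths
input_dict_size, output_dict_size = 100, 100
INPUT_LENGTH, OUTPUT_LENGTH = 20, 20

encoder_input = Input(shape=(INPUT_LENGTH,))
decoder_input = Input(shape=(OUTPUT_LENGTH,))

# encoder: bidirectional LSTM without return_sequences -> one 128-dim vector
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = Bidirectional(LSTM(64))(encoder)

# decoder: a 128-unit LSTM seeded with the encoder vector as both h and c
decoder = Embedding(output_dict_size, 64, input_length=OUTPUT_LENGTH, mask_zero=True)(decoder_input)
decoder = LSTM(128, return_sequences=True)(decoder, initial_state=[encoder, encoder])
decoder = TimeDistributed(Dense(output_dict_size, activation="softmax"))(decoder)

model = Model(inputs=[encoder_input, decoder_input], outputs=[decoder])
model.compile(optimizer='adam', loss='categorical_crossentropy')

Passing the same encoder vector as both the hidden state h and the cell state c is what makes the shapes line up with the decoder's (128, 128) state size.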