EderSantana / seya

Bringing up some extra Cosmo to Keras.

Giving Bidirectional RNN an 'input_shape' parameter #12

Closed NickShahML closed 8 years ago

NickShahML commented 8 years ago

Hey @EderSantana,

Thanks again for making the bidirectional rnn. You give a great example of how to use it here: https://github.com/EderSantana/seya/blob/master/examples/imdb_brnn.py

However, if we are not using an embedding layer, I was wondering how we can pass the "input_shape" parameter to the bidirectional LSTM. One workaround is to add a regular GRU before the bidirectional LSTM, but it would be great if we could use the bidirectional RNN as our first layer:

This Works:

print 'constructing Bidirectional RNN'
lstm = LSTM(output_dim=hidden_variables_encoding/2, return_sequences = False)
gru = GRU(output_dim=hidden_variables_encoding/2, return_sequences = False)  # original example was 128; we divide by 2 because the two directions' results will be concatenated

print 'constructing readout GRU'
readout = Sequential()
readout.add(Dense(y_matrix_axis, input_shape = (hidden_variables_decoding,), activation='softmax'))
gru_wr = GRUwithReadout(readout, return_sequences = True)

print '-------------------------------------Constructing Model ---------------------------------'
model = Sequential()
model.add(GRU(hidden_variables_encoding, input_shape = (x_maxlen, word2vec_dimension), return_sequences=True))
model.add(Bidirectional(forward=lstm, backward=gru, return_sequences = False))
model.add(Dense(hidden_variables_encoding))
model.add(Activation('relu'))
model.add(RepeatVector(y_maxlen))
for z in range(0,number_of_decoding_layers-1):
    model.add(GRU(hidden_variables_decoding, return_sequences=True))
    model.add(Dropout(dropout))
model.add(gru_wr)
model.compile(loss='categorical_crossentropy', optimizer='adam')

This does not work, but would be awesome if it could work:

lstm = LSTM(output_dim=hidden_variables_encoding/2, input_shape = (x_maxlen, word2vec_dimension), return_sequences = False)
gru = GRU(output_dim=hidden_variables_encoding/2, input_shape = (x_maxlen, word2vec_dimension), return_sequences = False) 
print 'constructing readout GRU'
readout = Sequential()
readout.add(Dense(y_matrix_axis, input_shape = (hidden_variables_decoding,), activation='softmax'))
gru_wr = GRUwithReadout(readout, return_sequences = True)

print '-------------------------------------Constructing Model ---------------------------------'
model = Sequential()
model.add(Bidirectional(forward=lstm, backward=gru, return_sequences = False))
model.add(Dense(hidden_variables_encoding))
model.add(Activation('relu'))
model.add(RepeatVector(y_maxlen))
for z in range(0,number_of_decoding_layers-1):
    model.add(GRU(hidden_variables_decoding, return_sequences=True))
    model.add(Dropout(dropout))
model.add(gru_wr)
model.compile(loss='categorical_crossentropy', optimizer='adam')

When run, it yields the following error at model.add(Dense(hidden_variables_encoding)):

Exception: Layer is not connected. Did you forget to set "input_shape"?

Thanks a lot again!

NickShahML commented 8 years ago

Actually, I just figured out that you can add a fake masking layer (e.g., Masking with mask_value=9999) and set "input_shape" on that masking layer. I think you can also do brnn.input_shape = (). So I'm gonna close this!
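
For reference, a minimal sketch of that workaround, assuming the old Keras/seya import paths used in the snippets above (the values for x_maxlen, word2vec_dimension, and hidden_variables_encoding are illustrative placeholders). The dummy Masking layer exists only to carry input_shape for the model, since setting it on the forward/backward layers inside the Bidirectional wrapper does not appear to propagate:

from keras.models import Sequential
from keras.layers.core import Masking
from keras.layers.recurrent import LSTM, GRU
from seya.layers.recurrent import Bidirectional

# illustrative placeholder values
x_maxlen = 100
word2vec_dimension = 300
hidden_variables_encoding = 256

lstm = LSTM(output_dim=hidden_variables_encoding / 2, return_sequences=False)
gru = GRU(output_dim=hidden_variables_encoding / 2, return_sequences=False)

model = Sequential()
# 9999 is a sentinel that never occurs in the data, so nothing is actually masked;
# the layer is only here so the model's first layer declares an input_shape.
model.add(Masking(mask_value=9999, input_shape=(x_maxlen, word2vec_dimension)))
model.add(Bidirectional(forward=lstm, backward=gru, return_sequences=False))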