Or you could simply use the following fork function to make 2 copies of your merged layer:
from keras.models import Sequential
from keras.layers import Activation, LSTM, Merge, TimeDistributedDense
from keras.optimizers import SGD

def fork(model, n=2):
    forks = []
    for i in range(n):
        f = Sequential()
        f.add(model)
        forks.append(f)
    return forks
left = Sequential()
left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
              forget_bias_init='one', return_sequences=True, activation='tanh',
              inner_activation='sigmoid', input_shape=(99, 13)))
right = Sequential()
right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', input_shape=(99, 13), go_backwards=True))
model = Sequential()
model.add(Merge([left, right], mode='sum'))
# Add second bidirectional LSTM layer
left, right = fork(model)
left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
              forget_bias_init='one', return_sequences=True, activation='tanh',
              inner_activation='sigmoid'))
right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', go_backwards=True))
# Rest of the stuff as it is
model = Sequential()
model.add(Merge([left, right], mode='sum'))
model.add(TimeDistributedDense(nb_classes))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
print("Train...")
model.fit([X_train, X_train], Y_train, batch_size=1, nb_epoch=nb_epoches, validation_data=([X_test, X_test], Y_test), verbose=1, show_accuracy=True)
It would be better to use the Bidirectional wrapper or the Graph for this sort of stuff.
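For reference, here is a minimal sketch of the wrapper route. It assumes a Keras version that ships keras.layers.wrappers.Bidirectional (the official docs are linked further down this thread); hidden_units and nb_classes are the same placeholders as above, and merge_mode='sum' mirrors Merge(..., mode='sum'):

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation
from keras.layers.wrappers import Bidirectional, TimeDistributed
from keras.optimizers import SGD

model = Sequential()
# First bidirectional layer; the wrapper builds both directions itself
model.add(Bidirectional(LSTM(hidden_units, return_sequences=True),
                        merge_mode='sum', input_shape=(99, 13)))
# Second bidirectional layer stacks directly; no fork() needed
model.add(Bidirectional(LSTM(hidden_units, return_sequences=True),
                        merge_mode='sum'))
model.add(TimeDistributed(Dense(nb_classes)))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
# Note: a single input tensor suffices here, so fitting becomes
# model.fit(X_train, Y_train, ...) rather than model.fit([X_train, X_train], ...)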
Wow, it worked. I used the fork method because it said some checks were not successful under the wrapper approach. I only just got it to work. Thanks a lot for the support.
@farizrahman4u I used your code as above and got a model, but when I load the model and test it, I get the following error:
File "BLSTM_NER.py", line 1058, in
test() File "BLSTM_NER.py", line 1038, in test ner.rnn_test(resfile,model_file,weights) File "BLSTM_NER.py", line 943, in rnn_test out = model.predict([self.X_test,self.X_test],batch_size=batch_size) File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 693, in predict return self._predict_loop(self._predict, X, batch_size, verbose)[0] File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 356, in _predict_loop batch_outs = f(ins_batch) File "/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py", line 448, in call return self.function(*inputs) File "/home/cl/download/Theano/theano/compile/function_module.py", line 845, in call self.inv_finder[c])) TypeError: Missing required input: <TensorType(float32, 3D)>
My test code is as follows:
print "load model"
model = model_from_json(open(my_model).read())
model.load_weights(weights)
print "load model finish"
out = model.predict([self.X_test, self.X_test], batch_size=batch_size)
Why am I getting this error? Can you help me? Thanks~
I was trying @farizrahman4u's example of a deep bidirectional LSTM on my dataset, which has 50000 rows and 20 columns (19 features and 1 class label), and
X_train = sequence.pad_sequences(X_train, maxlen=100)
X_test = sequence.pad_sequences(X_test, maxlen=100)
I am getting the following error. I know it is because of the dimension shape in the model.fit function, but I don't know how to resolve it.
The problem is with the shape of your input data. The error message is pretty clear: LSTM needs 3D data, but you are providing 2D. The example I provided above is obsolete; use the functional API instead.
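As a concrete illustration of the 3D requirement, here is a minimal sketch; the 50000×19 shape is taken from the description above, and the window length is an arbitrary assumption:

import numpy as np

# LSTM layers expect 3D input: (samples, timesteps, features)
X2d = np.asarray(X_train, dtype='float32')           # shape (50000, 19)

# Option 1: treat every row as a length-1 sequence
X3d = X2d.reshape((X2d.shape[0], 1, X2d.shape[1]))   # (50000, 1, 19)

# Option 2: if consecutive rows form a sequence, cut fixed-length windows
timesteps = 10                                       # illustrative choice
n = (X2d.shape[0] // timesteps) * timesteps          # drop the ragged tail
X3d_windows = X2d[:n].reshape((-1, timesteps, X2d.shape[1]))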
@farizrahman4u When you say "functional API", what do you mean exactly?
I saw this syntax here:
model.add(Bidirectional(LSTM(10, input_shape=(5, 10), return_sequences=True)))
But I don't know which package to import the Bidirectional() class from,
and this syntax here:
backwards = LSTM(64, go_backwards=True)(embedded)
But then I'm not exactly sure how to make a multi-layer bidirectional LSTM (use the forking approach you described above on Feb 3rd?).
P.S. I want many-to-many sequence labelling, so where do I need to put the return_sequences=True flags?
Google for the Keras functional API. The Bidirectional wrapper is from my seq2seq library.
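For orientation, a minimal sketch of a manually built bidirectional layer in the functional API, combining the two snippets quoted above (the 64-unit size and input shape are illustrative):

from keras.models import Model
from keras.layers import Input, LSTM, merge

inputs = Input(shape=(100, 43))                      # (timesteps, features), illustrative
forwards = LSTM(64, return_sequences=True)(inputs)
backwards = LSTM(64, return_sequences=True, go_backwards=True)(inputs)
# Note: go_backwards returns the sequence in reversed time order, so the
# two branches are not time-aligned here; see the flipping note further down.
merged = merge([forwards, backwards], mode='concat', concat_axis=-1)
model = Model(input=inputs, output=merged)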
@farizrahman4u Oh it's part of the seq2seq library I see.
Is this the correct usage to make a 2-layer bidirectional LSTM to output a category prediction for every input character?
Input chars are 43-dimensional, and there are 5 possible output categories.
from keras.models import Sequential
from keras.layers import Activation, LSTM, Merge, TimeDistributedDense
from keras.optimizers import SGD
def fork (model, n=2):
forks = []
for i in range(n):
f = Sequential()
f.add (model)
forks.append(f)
return forks
# First bidirectional LSTM layer
forward = Sequential()
forward.add(LSTM(output_dim=512, input_shape=(50, 43), return_sequences=True))
backward = Sequential()
backward.add(LSTM(output_dim=512, input_shape=(50, 43), return_sequences=True, go_backwards=True))
model = Sequential()
model.add(Merge([forward, backward], mode='concat'))
# Second bidirectional LSTM layer
forward_2, backward_2 = fork(model)
forward_2.add(LSTM(output_dim=512, input_shape=(50, 1024), return_sequences=True))  # concat merge doubles the feature dim
backward_2.add(LSTM(output_dim=512, input_shape=(50, 1024), return_sequences=True, go_backwards=True))
model = Sequential()
model.add(Merge([forward_2, backward_2], mode='concat'))
# Softmax decision layer
model.add(TimeDistributedDense(output_dim=5))
model.add(Activation('softmax'))
# Optimizer function
sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
print("Train...")
model.fit([X_train, X_train], Y_train, batch_size=1, nb_epoch=nb_epoches, validation_data=([X_test, X_test], Y_test), verbose=1, show_accuracy=True)
Also, for this type of architecture, do the inputs have to "overlap", like so:
x_0 = [0, 1, 2, 3, 4], y_0 = [A, B, C, D, E]
x_1 = [1, 2, 3, 4, 5], y_1 = [B, C, D, E, F]
x_2 = [2, 3, 4, 5, 6], y_2 = [C, D, E, F, G]
or not overlap like so:
x_0 = [0, 1, 2, 3, 4], y_0 = [A, B, C, D, E]
x_1 = [5, 6, 7, 8, 9], y_1 = [F, G, H, I, J]
x_2 = [10, 11, 12, 13, 14], y_2 = [K, L, M, N, O]
@farizrahman4u Before posting I already knew the error I am getting is because of a dimension problem. I have a train data set of size 390321 with 23 classes, and a test data set of size 20000 (I also have the correct labels, which have 40). I am loading the train, test, and correct-label data sets and trying to apply a deep bidirectional stateful LSTM.
The train data set size is 390321×41 (40 features plus one class label), the test data set size is 20000×40, and the corrected label size is 20000×1.
How do I reshape the dimensions and apply a deep bidirectional stateful LSTM?
@farizrahman4u @9thDimension When running an LSTM in the reverse direction, shouldn't the output correspond to input_n, input_{n-1}, input_{n-2}, ..., input_1? In that case, when concatenating with the output from the forward direction, shouldn't we reverse it?
@strin I have added the Bidirectional wrapper to Keras; see the bidirectional LSTM example.
The official manual can be referenced here: https://keras.io/layers/wrappers/#bidirectional
I'm afraid that the Bidirectional wrapper will not work in the Keras functional API. Any help with this sort of thing:
main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
lstm = LSTM(32)(x)
bidirectional = Bidirectional()(lstm)  # how should Bidirectional be instantiated?
@grafael How about this? Bidirectional takes a layer as its first argument:
bidirectional = Bidirectional(LSTM(32))(x)
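For completeness, a sketch wiring the wrapper into the snippet above (same names as before; the Dense output head is an illustrative assumption):

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense
from keras.layers.wrappers import Bidirectional

main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
bidirectional = Bidirectional(LSTM(32))(x)           # wrap the layer, then call it on x
out = Dense(1, activation='sigmoid')(bidirectional)  # illustrative output head
model = Model(input=main_input, output=out)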
Doesn't the 'go_backwards' option reverse the output order too? So model.add(Merge([left, right], mode='sum')) does not make sense (you must flip one of them before adding)?
@ylmeng Yes, it is handled automatically. You don't have to flip it before merging, as far as I know.
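For reference, the Bidirectional wrapper does this flip internally. If you ever merge the two directions by hand (as in the fork examples above), here is a sketch of an explicit flip, reusing the forwards/backwards tensors from the functional sketch earlier:

import keras.backend as K
from keras.layers import Lambda, merge

# Reverse the backward branch along the time axis so the two
# directions are time-aligned before merging
flipped_backwards = Lambda(lambda t: K.reverse(t, axes=1),
                           output_shape=lambda s: s)(backwards)
summed = merge([forwards, flipped_backwards], mode='sum')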
I am trying to implement an LSTM-based speech recognizer. So far I have set up a bidirectional LSTM (I think it is working as a bidirectional LSTM) by following the example for the Merge layer. Now I want to add another bidirectional LSTM layer, which would make it a deep bidirectional LSTM, but I am unable to figure out how to connect the output of the previously merged two layers into a second set of LSTM layers. I don't know whether it is possible with Keras. I hope someone can help me with this.
The code for my single-layer bidirectional LSTM is as follows.
Dimensions of my x and y values are as follows.
(100, 'train sequences')
(20, 'test sequences')
('X_train shape:', (100, 99, 13))
('X_test shape:', (20, 99, 13))
('y_train shape:', (100, 99, 11))
('y_test shape:', (20, 99, 11))