keras-team / keras


Implementation of CNN + bidirectional LSTM for videos #2646

Closed dakshvar22 closed 7 years ago

dakshvar22 commented 8 years ago

I have 25 videos of 2700 frames (RGB images of size 32x32) each. How do I build my model so that the CNN takes video input and the LSTM's time steps correspond to the number of frames? I implemented the bidirectional LSTM in the following way:

    model = Sequential()
    left = Sequential()
    left.add(LSTM(output_dim=256, init='uniform', inner_init='uniform',
                  forget_bias_init='one', return_sequences=True, activation='tanh',
                  inner_activation='sigmoid', input_shape=(32, 32)))
    right = Sequential()
    right.add(LSTM(output_dim=256, init='uniform', inner_init='uniform',
                   forget_bias_init='one', return_sequences=True, activation='tanh',
                   inner_activation='sigmoid', input_shape=(32, 32), go_backwards=True))

    model.add(Merge([left, right], 'sum'))

This gives me an error:

    Exception: Merge can only be called on a list of tensors, not a single tensor. Received: Elemwise{switch,no_inplace}.0

@fchollet Could you please help me with this?

joelthchao commented 8 years ago

Typo? model.add(Merge([left, right],'sum'))

dakshvar22 commented 8 years ago

@joelthchao Sorry, that was a typo. Missed the last bracket here. Any suggestions?

tboquet commented 8 years ago

I'm not sure if it will fit your needs, but you can try the TimeDistributed wrapper. You could also use a pretrained VGG, freeze its weights, and fine-tune the LSTM part. As for the error, do you have a fully reproducible example (model + toy dataset)?
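
A rough sketch of the TimeDistributed idea in the Keras 1.x API, assuming Theano-style (channels, rows, cols) frames; the frame count and layer sizes below are only illustrative:

    # Minimal sketch: wrap the per-frame CNN layers in TimeDistributed so the
    # time axis (frames) is carried through to the recurrent layer.
    from keras.models import Sequential
    from keras.layers import (Activation, Convolution2D, Dense, Flatten, LSTM,
                              MaxPooling2D, TimeDistributed)

    n_frames, img_channels, img_rows, img_cols = 2700, 3, 32, 32

    model = Sequential()
    # input_shape goes on the TimeDistributed wrapper, not on Convolution2D
    model.add(TimeDistributed(Convolution2D(32, 3, 3, border_mode='valid'),
                              input_shape=(n_frames, img_channels, img_rows, img_cols)))
    model.add(Activation('relu'))
    model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(128, return_sequences=False))
    model.add(Dense(1, activation='sigmoid'))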

joelthchao commented 8 years ago

@dakshvar22 I corrected the typo and no exception is produced.

dakshvar22 commented 8 years ago

@joelthchao, I still can't get rid of the exception. This is my code for building the model:

    def buildModel():
        model = Sequential()
        model.add(Convolution3D(32, 3, 3, 3, border_mode='same', dim_ordering='th',
                                input_shape=(2721, img_channels, img_rows, img_cols)))
        model.add(keras.layers.normalization.BatchNormalization())
        model.add(Activation('relu'))
        model.add(MaxPooling3D(pool_size=(2, 2, 2)))
        model.add(Dropout(0.25))

        model.add(Convolution3D(32, 3, 3, 3, border_mode='same'))
        model.add(keras.layers.normalization.BatchNormalization())
        model.add(Activation('relu'))
        model.add(MaxPooling3D(pool_size=(2, 2, 2)))
        model.add(Dropout(0.25))

        model.add(Convolution3D(32, 3, 3, 3, border_mode='same'))
        model.add(keras.layers.normalization.BatchNormalization())
        model.add(Activation('relu'))
        model.add(MaxPooling3D(pool_size=(2, 2, 2)))
        model.add(Dropout(0.25))

        left = Sequential()
        left.add(LSTM(output_dim=256, init='uniform', inner_init='uniform',
                      forget_bias_init='one', return_sequences=True, activation='tanh',
                      inner_activation='sigmoid', input_shape=(32, 32)))
        right = Sequential()
        right.add(LSTM(output_dim=256, init='uniform', inner_init='uniform',
                       forget_bias_init='one', return_sequences=True, activation='tanh',
                       inner_activation='sigmoid', input_shape=(32, 32), go_backwards=True))
        model.add(Merge([left, right], 'sum'))

        model.add(Flatten())
        model.add(Dense(4096, activation='relu'))
        model.add(Dense(1024, activation='relu'))
        model.add(Dense(512, activation='relu'))

        model.add(Dense(1, activation='sigmoid'))

        sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
        model.compile(loss='mean_absolute_error',
                      optimizer=sgd,
                      metrics=['accuracy'])

Do you see anything wrong? I am using Keras version 1.0.2

dakshvar22 commented 8 years ago

@tboquet Sorry, I do not have a reproducible example. I have pasted the code I use to build the model in the comment above. Do you see anything wrong? I am using Keras version 1.0.2.

joelthchao commented 8 years ago

@dakshvar22 I think the problem comes from your MaxPooling3D layers. Won't they destroy your img_channels, or do you actually have img_channels >= 4?

dakshvar22 commented 8 years ago

@joelthchao My img_channels is equal to 3 (RGB). I have 25 videos of 2700 frames (RGB images of size 32x32) each. I need to input the whole video (all frames) at once into the network.

joelthchao commented 8 years ago

@dakshvar22 Sorry, I only have TensorFlow and it doesn't support Convolution3D, so I cannot test it. Just a reminder that the input for Convolution3D has shape (samples, channels, conv_dim1, conv_dim2, conv_dim3). In your first Convolution3D you have probably turned video_length into 32. Make sure your convolutional layers operate on the right axes.
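
To make that concrete, here is a minimal hedged sketch of what the channels-first ordering implies for this data (illustrative only, not tested against this exact setup):

    # With dim_ordering='th', Convolution3D expects
    # input_shape=(channels, conv_dim1, conv_dim2, conv_dim3), so the
    # frame/time axis belongs to the conv dims, not the channel slot.
    from keras.models import Sequential
    from keras.layers import Convolution3D

    n_frames, img_channels, img_rows, img_cols = 2700, 3, 32, 32

    model = Sequential()
    model.add(Convolution3D(32, 3, 3, 3, border_mode='same', dim_ordering='th',
                            input_shape=(img_channels, n_frames, img_rows, img_cols)))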

dakshvar22 commented 8 years ago

@anayebi Can you help me out with this? I have also posted this question on your library's repo. I have the latest version of Keras installed on my system, version 1.0.2.

anayebi commented 8 years ago

If you take a look at the Readme for Keras-extra, it explicitly states that the code won't work for versions of Keras later than 0.3.0.

If you have Version 1.0.2, you can just use the TimeDistributed wrapper around Convolution2D, as explained here: http://keras.io/layers/wrappers/

Hope that helps! :)

dakshvar22 commented 8 years ago

@tboquet @fchollet Update: I am running my code on Keras 1.0.2. I get an error when I use the following code, which is almost identical to the example given for the TimeDistributed wrapper:

    model.add(TimeDistributed(Convolution2D(8, 4, 4, border_mode='valid', batch_input_shape=(2710, 3, 32, 32))))
    model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2), border_mode='valid')))
    model.add(Activation('relu'))
    model.add(TimeDistributed(Convolution2D(8, 3, 3, border_mode='valid')))
    model.add(Activation('relu'))
    model.add(TimeDistributedFlatten())
    model.add(Activation('relu'))
    model.add(GRU(output_dim=100, return_sequences=True))
    model.add(GRU(output_dim=50, return_sequences=False))
    model.add(Dropout(.2))
    model.add(Dense(1))

I get the following error -

    File "/home/toothless/daksh/ml/project/src/model/model2.py", line 133, in buildModel
      model.add(TimeDistributed(Convolution2D(8, 4, 4, border_mode='valid', input_shape=(2710,3,32,32))))
    File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 102, in add
      raise Exception('The first layer in a Sequential model must '
    Exception: The first layer in a Sequential model must get an input_shape or batch_input_shape argument.

I have initialised my model with model = Sequential(). I have specified the input shape already, but it still gives the error while building the model itself. I am not even using model.fit right now. Please help me out.

dakshvar22 commented 8 years ago

Another update: I figured out there was a parenthesis issue in the first layer declaration. It should have been this:

    model.add(TimeDistributed(Convolution2D(8, 4, 4, border_mode='valid'), input_shape=(2710, 3, 32, 32)))

Still, I get this new error:

    File "/home/toothless/daksh/ml/project/src/model/model2.py", line 133, in buildModel
      model.add(TimeDistributed(Convolution2D(8, 4, 4, border_mode='valid'), input_shape=(2710,3,32,32)))
    File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 110, in add
      layer.create_input_layer(batch_input_shape, input_dtype)
    File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 337, in create_input_layer
      dtype=input_dtype, name=name)
    File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1026, in Input
      name=name, input_dtype=dtype)
    File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 957, in __init__
      shape = input_tensor._keras_shape
    AttributeError: 'TensorVariable' object has no attribute '_keras_shape'

Any idea what this new error means? @joelthchao @tboquet @anayebi

dakshvar22 commented 8 years ago

@fchollet @joelthchao @anayebi @jamesfm I have finally been able to connect the convolution layers with one LSTM layer. But I want to implement a bidirectional layer, as mentioned in the first post of this thread. Here is the code I am now using to build the model:

    def buildModel():
        model = Sequential()
        model.add(TimeDistributed(Convolution2D(32, 3, 3, border_mode='valid'),
                                  input_shape=(1361, img_channels, img_rows, img_cols)))
        model.add(keras.layers.normalization.BatchNormalization())
        model.add(Activation('relu'))
        model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
        model.add(Dropout(0.25))

        model.add(TimeDistributed(Convolution2D(32, 3, 3, border_mode='valid')))
        model.add(keras.layers.normalization.BatchNormalization())
        model.add(Activation('relu'))
        model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
        model.add(Dropout(0.25))

        model.add(TimeDistributed(Convolution2D(32, 3, 3, border_mode='valid')))
        model.add(keras.layers.normalization.BatchNormalization())
        model.add(Activation('relu'))
        model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
        model.add(Dropout(0.25))

        model.add(TimeDistributed(Flatten()))
        model.add(TimeDistributed(Dense(256, activation='relu')))
        model.add(TimeDistributed(Dense(128, activation='relu')))

        # model.add(LSTM(output_dim=255, return_sequences=True))

        left = Sequential()
        left.add(LSTM(output_dim=256, init='uniform', inner_init='uniform',
                      forget_bias_init='one', return_sequences=True, activation='tanh',
                      inner_activation='sigmoid', input_shape=(1, 128)))
        right = Sequential()
        right.add(LSTM(output_dim=256, init='uniform', inner_init='uniform',
                       forget_bias_init='one', return_sequences=True, activation='tanh',
                       inner_activation='sigmoid', input_shape=(1, 128), go_backwards=True))

        l = list()
        l.append(left)
        l.append(right)
        model.add(Merge(l, 'sum'))

        model.add(TimeDistributed(Dense(1, activation='sigmoid')))

        sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
        model.compile(loss='mean_absolute_error',
                      optimizer=sgd,
                      metrics=['accuracy'])

I get a strange error while merging the two LSTM layers -

Exception: Merge can only be called on a list of tensors, not a single tensor. Received: Reshape{3}.0

I am sure that the individual layer instances are created correctly. What could go wrong here?
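
One likely explanation, going by how Sequential.add behaves in Keras 1.x (and by the traceback PMitura posts below): once the model already contains layers, add() calls the new layer on the model's single output tensor, so the Merge never receives the [left, right] list. Merge of Sequential branches only works as the first layer of the model that consumes it. A minimal hedged sketch of that pattern (shapes and sizes are illustrative, and this alone does not feed the CNN output into both branches):

    # Two branches over the same kind of input; Merge must be the *first*
    # layer of the Sequential that consumes the branches.
    from keras.models import Sequential
    from keras.layers import Dense, LSTM, Merge

    left = Sequential()
    left.add(LSTM(256, input_shape=(100, 128)))
    right = Sequential()
    right.add(LSTM(256, input_shape=(100, 128), go_backwards=True))

    merged = Sequential()
    merged.add(Merge([left, right], mode='sum'))  # receives the list of branch outputs
    merged.add(Dense(1, activation='sigmoid'))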

PMitura commented 8 years ago

Hello, any updates or insights on this issue? I'm having similar problems using the following code to build a bidirectional GRU:

    model.add(TimeDistributed(Dense(int(TD_LAYER_MULTIPLIER * (alphaSize +
        nomiSize)), activation = 'tanh'),
        input_shape = (None, alphaSize + nomiSize)))

    fwModel = Sequential()
    fwModel.add(GRU(int(GRU_LAYER_MULTIPLIER * alphaSize),
            activation = 'sigmoid',
            input_shape = (None, alphaSize + nomiSize)))

    bwModel = Sequential()
    bwModel.add(GRU(int(GRU_LAYER_MULTIPLIER * alphaSize), 
            activation = 'sigmoid', go_backwards = True,
            input_shape = (None, alphaSize + nomiSize)))

    model.add(Merge([fwModel, bwModel], mode = 'sum'))

This is the exception I'm getting:

Traceback (most recent call last):
  File "./run.py", line 22, in <module>
    main(sys.argv[1:])
  File "./run.py", line 19, in main
    run(source)
  File "/home/peter/Projects/smiles-nn/smiles-neural-network/rnn/rnn.py", line 319, in run
    model = setup(alphaSize, nomiSize)
  File "/home/peter/Projects/smiles-nn/smiles-neural-network/rnn/rnn.py", line 97, in setup
    return configureModel(alphaSize, nomiSize)
  File "/home/peter/Projects/smiles-nn/smiles-neural-network/rnn/rnn.py", line 85, in configureModel
    model.add(Merge([fwModel, bwModel], mode = 'sum'))
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 146, in add
    output_tensor = layer(self.outputs[0])
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1245, in __call__
    'not a single tensor. Received: ' + str(inputs))
Exception: Merge can only be called on a list of tensors, not a single tensor. Received: Reshape{3}.0

I'm using Keras 1.0.5 and Theano 0.8.2. Thanks for any kind of help!

sarschu commented 8 years ago

Exactly the same here. Any idea anyone?

kfzn commented 8 years ago

Try 'merge' instead of 'Merge': "Merge is for layers, merge is for tensors" -- see #2467 for details.

Funny thing is, 'merge' and 'Merge' are not the same, and I got confused too.
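
A minimal hedged sketch of the tensor-level approach with the lowercase merge in the Keras 1.x functional API (shapes and layer sizes are illustrative, not taken from the code above):

    # Both directions are applied to the same input tensor and their
    # outputs are summed with the lowercase, tensor-level merge.
    from keras.models import Model
    from keras.layers import Input, GRU, merge

    timesteps, features = 100, 64  # illustrative

    inp = Input(shape=(timesteps, features))
    fwd = GRU(128, return_sequences=True)(inp)
    # go_backwards processes the sequence in reverse, so its output comes
    # back in reversed time order; the later Bidirectional wrapper takes
    # care of re-reversing it before merging.
    bwd = GRU(128, return_sequences=True, go_backwards=True)(inp)
    summed = merge([fwd, bwd], mode='sum')
    model = Model(input=inp, output=summed)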

7lagrange commented 8 years ago

Same here. It is not a problem of merge vs. Merge; I am using layers as the input to Merge.

The error is: Exception: Merge can only be called on a list of tensors, not a single tensor. Received: Elemwise{switch,no_inplace}.0

pranaymanocha commented 8 years ago

Any progress @PMitura @dakshvar22 @sarschu ? I am trying to implement the same thing but am not able to integrate the CNN with the Bidirectional LSTM. I have created 2 models separately but am not able to join them together. Any views ?

sarschu commented 8 years ago

I think in the end it wasn't the CNN. It was some problem with the input shapes. Since it worked forward without any complaints, I built a second net that looks exactly the same and put it through a backwards layer. After that I could merge. Kind of a dirty solution though....

PMitura commented 8 years ago

The newly added Bidirectional wrapper (doc) (pull request) seems to solve this issue. It is working fine for me now.
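
For anyone landing here later, a minimal hedged sketch of that wrapper (it requires a Keras release that includes Bidirectional, i.e. newer than the 1.0.x versions used earlier in this thread); shapes and sizes are illustrative:

    from keras.models import Sequential
    from keras.layers import Dense, LSTM, TimeDistributed, Bidirectional

    timesteps, features = 100, 128  # illustrative

    model = Sequential()
    # One wrapper replaces the manual left/right branches plus Merge.
    model.add(Bidirectional(LSTM(256, return_sequences=True), merge_mode='sum',
                            input_shape=(timesteps, features)))
    model.add(TimeDistributed(Dense(1, activation='sigmoid')))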

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.