keras-team / keras

Deep Learning for humans
Apache License 2.0

Stateful LSTMs - error despite using "batch_input_shape" #2030

Closed: anujgupta82 closed this issue 7 years ago

anujgupta82 commented 8 years ago

When I add 'stateful' to an LSTM, I get the following exception: "If a RNN is stateful, a complete input_shape must be provided (including batch size)." Based on other threads (#1125, #1130) I am using the "batch_input_shape" option, yet I still get the error. I raised the same question in forum!topic/keras-users/nwB3ilYY4ZQ but got no response.

You can find my complete code here:

NasenSpray commented 8 years ago

batch_input_shape must be passed to the first layer of the network.

gibipara92 commented 8 years ago

Same problem here:

model = Sequential()
model.add(GRU(100, activation='relu', stateful=True, return_sequences=True, batch_input_shape=(batch_size, X_train.shape[-2], X_train.shape[-1])))
...

santi-pdp commented 8 years ago

What is your X_train.shape[0]? Do all your batches have the same number of samples? That is a must when using stateful RNNs.

gibipara92 commented 8 years ago

X_train.shape[0] is the number of samples.

I have just tried using a batch_size that is a factor of the number of samples (so all batches have exactly the same number of samples) and it works, thanks!

A note mentioning this might be helpful in the documentation where it discusses statefulness. Edit: it's already there, my bad.
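For anyone hitting the same constraint, here is a minimal sketch in plain Python of the two usual workarounds: picking a batch size that divides the sample count, or trimming the data to a whole number of batches. The helper names `valid_batch_sizes` and `trim_to_full_batches` and the 10-sample data set are hypothetical, made up for illustration; they are not part of Keras.

```python
def valid_batch_sizes(num_samples):
    """Return every batch size that divides num_samples evenly."""
    return [b for b in range(1, num_samples + 1) if num_samples % b == 0]


def trim_to_full_batches(samples, batch_size):
    """Drop trailing samples so the result length is a multiple of batch_size."""
    usable = (len(samples) // batch_size) * batch_size
    return samples[:usable]


# Hypothetical data set of 10 samples, for illustration only.
samples = list(range(10))

print(valid_batch_sizes(len(samples)))    # [1, 2, 5, 10]
print(len(trim_to_full_batches(samples, 4)))  # 8
```

Because `trim_to_full_batches` only uses `len` and slicing along the first axis, it should also work on a NumPy array such as X_train, as long as the matching targets are trimmed the same way.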

philipperemy commented 8 years ago

Have a look at this:

cmgladding commented 7 years ago

If batch_input_shape must be specified in the first layer of a stateful network, how is this done when using the functional API? The Input() layer will not allow it. I have tried everything I can think of but am still receiving this same exception ("complete input_shape must be provided (including batch size)"), even with batch size 1. I am trying to make an LRCN using TimeDistributed CNN layers, followed by a couple dense layers, followed by LSTM:

inputs = Input(shape=(1,3,227,227))

conv_1 = TimeDistributed(Convolution2D(96, 11, 11,subsample=(4,4),activation='relu',

conv_2 = TimeDistributed(MaxPooling2D((3, 3), strides=(2,2)))(conv_1)
conv_2 = TimeDistributed(LRN(name="convpool_1"))(conv_2)
conv_2 = TimeDistributed(ZeroPadding2D((2,2)))(conv_2)
conv_2 = TimeDistributed(Convolution2D(128,5,5,activation="relu",name="conv_2"))(conv_2)

...skipping similar conv layers...

dense_1 = TimeDistributed(MaxPooling2D((3, 3), strides=(2,2),name="convpool_5"))(conv_5)
dense_1 = TimeDistributed(Flatten(name="flatten"))(dense_1)
dense_1 = TimeDistributed(Dense(4096, activation='relu',name='dense_1'))(dense_1)
dense_2 = Dropout(0.5)(dense_1)
dense_2 = TimeDistributed(Dense(4096, activation='relu',name='dense_2'))(dense_2)

lstm_1 = Dropout(0.5)(dense_2)
lstm_1 = LSTM(100,
              batch_input_shape=(1,1,4096), #(batch size,timesteps,feature shape)

dense_3 = Dense(6,name='dense_out')(lstm_1)
prediction = Activation("tanh",name="tanh")(dense_3)

(This is a regression problem; I am trying to predict six values at each time step based on an image sequence.)

The same network using the other API does not produce the same exception, but I am hoping to take advantage of the functional API, so I'd like to figure out what I'm doing wrong.

cmgladding commented 7 years ago

Figured it out from - the error message is misleading. The "Input" function takes argument "batch_shape", not "batch_input_shape".

Ryan-fireball commented 7 years ago

@cmgladding I tried "batch_shape", but it was not recognized by Keras. I don't know why. The issue persists for me no matter what keywords I use.

wulabs commented 7 years ago

I also have this issue with a functional model.

datlife commented 7 years ago

Here is my example for those who get stuck. Indeed, the error message is misleading. I had to change Input(shape=()) to Input(batch_shape=()) in order for it to work.

Version that raises the error:

frame_sequence = Input(shape=(TIME_STEPS, HEIGHT, WIDTH, CHANNELS))
net = TimeDistributed(self.vision_model)(frame_sequence)
net = LSTM(HIDDEN_UNITS, stateful=True, return_sequences=False)(net)

Correct version:

frame_sequence = Input(batch_shape=(BATCH_SIZE, TIME_STEPS, HEIGHT, WIDTH, CHANNELS))
net = TimeDistributed(self.vision_model)(frame_sequence)
net = LSTM(HIDDEN_UNITS, stateful=True, return_sequences=False)(net)

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.

rajatkoner08 commented 6 years ago

Hi, I am facing almost the same issue with a shared convnet: two features are concatenated, fed to a Dense layer, and then into an LSTM. The model is like fig. 1.

I have been stuck on this issue for a long time; any help is much appreciated. @farizrahman4u @dat-ai

# First, define the vision modules
input_dim = (224, 224, 3)
image_input = Input(shape=input_dim)

vision_model = Conv2D(64, (3, 3), activation='relu', padding='same')(image_input)
vision_model = Conv2D(64, (3, 3), activation='relu')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
vision_model = Conv2D(128, (3, 3), activation='relu', padding='same')(vision_model)
vision_model = Conv2D(128, (3, 3), activation='relu')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
vision_model = Conv2D(256, (3, 3), activation='relu', padding='same')(vision_model)
vision_model = MaxPooling2D((2, 2))(vision_model)
out = Flatten()(vision_model)

model = Model(image_input, out)

digit_a = Input(shape=input_dim)
digit_b = Input(shape=input_dim)

# The vision model will be shared, weights and all
out_a = model(digit_a)
out_b = model(digit_b)

concatenated = concatenate([out_a, out_b])
out = Dense(2048, activation='relu')(concatenated)

concat_model = Model([digit_a, digit_b], out)

# batch size = 32, number of unrolls = 2; I am not sure how to pass a multi-input sequence as input
frame_sequence = Input(batch_shape=(32, 2, 224, 224, 3))
unroll_feature = TimeDistributed(concat_model)(frame_sequence)

And the error message I got:

Using TensorFlow backend.
Traceback (most recent call last):
  File "/home/rajat/Downloads/pycharm-2017.2.3/helpers/pydev/", line 1599, in
    globals =['file'], None, None, is_module)
  File "/home/rajat/Downloads/pycharm-2017.2.3/helpers/pydev/", line 1026, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/home/rajat/Downloads/re3-tensorflow-master/keras_training/", line 52, in
    unroll_feature = TimeDistributed(concat_model)(frame_sequence)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/", line 602, in call
    output =, kwargs)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/layers/", line 188, in call
    unroll=False)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/backend/", line 2467, in rnn
    outputs, = step_function(inputs[0], initial_states + constants)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/layers/", line 179, in step
    output =, kwargs)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/", line 2058, in call
    outputtensors, , _ = self.run_internal_graph(inputs, masks)
  File "/home/rajat/.local/lib/python2.7/site-packages/keras/engine/", line 2262, in run_internal_graph
    assert str(id(x)) in tensor_map, 'Could not compute output ' + str(x)
AssertionError: Could not compute output Tensor("dense_1/Relu:0", shape=(?, 2048), dtype=float32)

madhuhegde commented 5 years ago

Hi @rajatkoner08, Please let me know if you were able to solve this problem and how to fix it. Thanks in advance. Madhu

mmehedin commented 5 years ago

Needs to be built like so:

self.lstm_custom_1 = keras.layers.LSTM(128,batch_input_shape=batch_input_shape, return_sequences=False, stateful=True)

faltinl commented 5 years ago

Using Keras for R with a Functional API I am observing a similar problem which I can't resolve referring to the advice given above, since the cases above refer to Keras for Python and are (for me) not easily transferred to Keras for R.

Since this thread was closed long ago, I have raised this topic anew as issue #13262; hopefully there will be replies regarding Keras for R.