keras-team / keras


Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4 #7403

Closed. ajanaliz closed this issue 3 years ago.

ajanaliz commented 7 years ago

Here's the code I've written:


model.add(LSTM(150,
               input_shape=(64, 7, 339),
               return_sequences=False))
model.add(Dropout(0.2))

model.add(LSTM(
    200,
    return_sequences=True))
model.add(Dropout(0.2))

model.add(LSTM(
    150,
    return_sequences=True))
model.add(Dropout(0.2))

model.add(Dense(
    output_dim=1))
model.add(Activation('sigmoid'))

start = time.time()
model.compile(loss='mse', optimizer='rmsprop')
print('compilation time : ', time.time() - start)

model.fit(
    trainX,
    trainY_Buy,
    batch_size=64,
    nb_epoch=10,
    verbose=1,
    validation_split=0.05)

The error I'm getting is: ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4, raised on this line: model.add(LSTM(150, input_shape=(64, 7, 339), return_sequences=False))

My X shape is (492, 7, 339) and my Y shape is (492,).

Does anyone have any idea what I'm doing wrong?

ajanaliz commented 7 years ago

The same thing happens when I write the following for the first LSTM layer:

model.add(LSTM(150,
               input_shape=trainX.shape,
               return_sequences=False))
td2014 commented 7 years ago

@ajanaliz I took a quick look, and I believe you need to remove the leading "64" from the input shape of the LSTM layer: input_shape=(64, 7, 339) --> input_shape=(7, 339). Keras' convention is that the batch dimension (the number of examples, not the same as timesteps) is omitted from the input_shape argument; the number of examples per batch is handled in the fit call. I hope that helps. Thanks.
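A minimal sketch of that convention, using dummy data with the shapes from the original post (not code from the thread):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

trainX = np.zeros((492, 7, 339))                    # (samples, timesteps, features)
model = Sequential()
model.add(LSTM(150, input_shape=trainX.shape[1:]))  # i.e. (7, 339); the batch axis is omitted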

ajanaliz commented 7 years ago

@td2014 nope, that way my error is: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2

td2014 commented 7 years ago

@ajanaliz You may need to set return_sequences=True on the first layer. Maybe that will solve it. I hope that works. Thanks.
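Putting both suggestions together, a hedged sketch with dummy data in the poster's shapes (an illustration, not the poster's final code); here the intermediate LSTMs return sequences and the last one does not, so the Dense(1) output matches the 1-D target:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

trainX = np.zeros((492, 7, 339))   # dummy data with the shapes from the original post
trainY_Buy = np.zeros((492,))

model = Sequential()
model.add(LSTM(150, input_shape=(7, 339), return_sequences=True))   # batch axis omitted
model.add(Dropout(0.2))
model.add(LSTM(200, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(150, return_sequences=False))   # last LSTM returns only the final timestep
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='mse', optimizer='rmsprop')
model.fit(trainX, trainY_Buy, batch_size=64, epochs=10,   # epochs was nb_epoch in Keras 1
          verbose=1, validation_split=0.05)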

ajanaliz commented 7 years ago

omg, thanks

junfeng-chen commented 7 years ago

Hi, I have the same bug as you. My X shape is (24443, 124, 30) and my y shape is (24443, 124). Maybe it's the shape of y that causes the error for me. May I know the shape of your Y?

ajanaliz commented 7 years ago

Your y shape should be (24443,).

junfeng-chen commented 7 years ago

I get an error even when I construct the same keras layer as yours... "Errors when checking target: expected activation_1 to have 3 dimensions, but got array with shape (24443, 1)"

ajanaliz commented 7 years ago

I think it has to do with your LSTM architecture and the way your layers return sequences.

atishsawant commented 7 years ago

Having the same issue with the following code. Was there ever a fix for this?

model.add(LSTM(32, input_shape=(588425, 26), return_sequences=True))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(df_matrix, y, epochs=2, batch_size=1, verbose=2)

kicking off this error: ValueError: Input 0 is incompatible with layer lstm_23: expected ndim=3, found ndim=2

These are my inputs: df_matrix.shape = (1, 588425, 26)

I tried it as a 2-D array as well and I get the same error. What am I doing wrong here? Running on Windows 10 with the latest Keras/TF backend.

hadisaadat commented 7 years ago

Dear @td2014, could you give me a hand? The real problem is this: I want to have 2 stacked LSTM layers, in the following way:

batch_size = 900
Max_word_len = 30
Max_sentence_length = # some custom number
Char_embedding_Size = len(chars)

print('Build model...')
model = Sequential()

# return_sequences=False to output only at the end of each sequence
model.add(LSTM(batch_size, input_shape=(Max_word_len, Char_embedding_Size), return_sequences=False))
model.add(LSTM(batch_size/Max_word_len, input_shape=(Max_sentence_length, Max_word_len), return_sequences=True))
model.add(Dense(62))
model.add(Activation('tanh'))

Any hint, suggestion, or implementation in Keras or TensorFlow is welcome. Thanks a lot again in advance.

Best regards,

Hadi.

td2014 commented 7 years ago

@hadisaadat Some suggestions that might help provide some direction for you. I would suggest breaking your problem into several pieces to make sure you understand the input/output dimensions of each layer (reviewing https://keras.io/layers/recurrent/ will help). Also, you have "batch_size" as the first argument of the LSTM call:

model.add(LSTM(batch_size, input_shape=(Max_word_len, Char_embedding_Size), return_sequences=False))

The first argument (which is called "units" in the documentation) is the output dimensionality of that layer. In other words, you will have "units" LSTM cells at that layer. In Keras, the batch dimension is not typically specified in the model architecture.

It might be useful to start with a single LSTM layer that feeds into a single Dense layer (with number of units=1, so Dense(1), not Dense(62)) and look at the architecture you get from model.summary().

Also, with "return_sequences=True", each sample will generate an output from each LSTM cell at every timestep. This is typically fed into a second RNN layer, not into a regular Dense layer. Also, Keras should automatically infer the input data shape for every layer except the first.

If you are trying to process characters -> words, and then words -> sentences, it might be easier to create two separate models (each one with a single LSTM layer) instead of directly trying to stack them, because your batch definition is changing between layers.

As far as the final layer goes, that will depend on what your target is; in other words, what you want your model to output. Are you trying to classify the type of sentence? From your architecture above, it seems you have 62 different outputs/classes you are trying to model. Deciding what your target is should help you decide on the final output layer.

I hope this gives you a few ideas to help. Thanks.
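A minimal sketch of that single-LSTM suggestion; the unit count and Char_embedding_Size below are placeholders, not values from the thread:

from keras.models import Sequential
from keras.layers import LSTM, Dense

Max_word_len = 30          # from the comment above
Char_embedding_Size = 64   # placeholder for len(chars)

model = Sequential()
model.add(LSTM(128, input_shape=(Max_word_len, Char_embedding_Size)))  # 128 units, chosen arbitrarily
model.add(Dense(1))
model.compile(loss='mse', optimizer='rmsprop')
model.summary()   # inspect the input/output shape of each layer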

hadisaadat commented 7 years ago

Dear @td2014, thank you so much for your reply.

Yes, I'm building hierarchical stacked layers for text, as you mentioned: chars -> word, and then word -> vec[0..61] (a vector of real numbers).

If you are trying to processing characters -> words, and then words --> sentences

Yes, you're right that it's easier, but I want to keep the hierarchical structure so I can compare its effect against having 2 separate models as you suggested.

it might be easier to create two separate models (each one with a single LSTM layer) instead of directly trying to stack

Is it a big challenge to handle this? I mean, isn't it possible to have 2 stacked layers with a relative batch_size, by some option like the None keyword?

because your batch definition is changing between layers

In particular, there are some unclear points. Let's say I have pseudo-code like the one above:

In the first LSTM layer, at the character level, I want the layer to output only at the end of the sequence, which is the end of the word (assume the words are padded), so I set return_sequences=False to force it to output only at the end rather than for each input character. On the other hand, I want the second layer to receive an input for each word and produce an output for each word as well, so return_sequences=True is for this layer. Some questions arise here:

  1. What is the real relation between the batch_size here and return_sequences=False/True?

  2. If I don't force the first layer to output only at the end of the sequence, it works, but that's not what I need; when I do force it, I receive the dimension error on the second layer's input. How do I solve that?

My final output is just a real-number vector for each word; it's not a class, so I don't need a softmax-like layer at the end to calculate a probability for each class. It's like a regression that predicts a real number, but with dimension 62.

Also, as far as the final layer goes, that will depend on what your target is? In other words, what do you want your model to output. Are you trying to classify the type of sentence? From your above architecture, it seems you have 62 different outputs/classes you are trying to model. Depending on what your target is, this should help you decide on the final output layer.

However, if it's not possible with the Keras architecture, any suggestion in TensorFlow is welcome. Finally, I want to thank you again for your attention and explanations.

Hadi.

td2014 commented 7 years ago

@hadisaadat . Here is a bit more information which might help, based on your info above:

What is the real relation between the batch_size here and return_sequences=False/True?

The basic layout is the following:

Batch_sample_1: timestep1, timestep2, ..., timestepN
Batch_sample_2: timestep1, timestep2, ..., timestepN
...
Batch_sample_BatchSize: timestep1, timestep2, ..., timestepN

When you set return_sequences=True, then for each batch sample you get an output at every timestep (one output per LSTM cell in that layer).
When you set return_sequences=False, then for each batch sample you get the output for timestepN only.
In the second layer, since you are doing input and output per word (which comes from layer 1), I think return_sequences=False is what you want: you only want one output per input word, not per character that forms the word, and for each sequence of timestep1...timestepN you only have one word predicted. (The short sketch below prints both output shapes.)
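A toy sketch (arbitrary shapes, not from the thread):

from keras.models import Sequential
from keras.layers import LSTM

# return_sequences=True: one output per timestep -> (batch, timesteps, units)
m1 = Sequential([LSTM(8, return_sequences=True, input_shape=(10, 4))])
print(m1.output_shape)   # (None, 10, 8)

# return_sequences=False: only the last timestep -> (batch, units)
m2 = Sequential([LSTM(8, return_sequences=False, input_shape=(10, 4))])
print(m2.output_shape)   # (None, 8)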

Some more ideas based on using Masking and Merge Layers that might suggest some direction:

import numpy as np
from keras.layers import LSTM, Input, Masking, multiply
from keras.models import Model

#
# Create input sequences
#
numTimesteps=20
slopeArray1=np.linspace(0, 10, num=numTimesteps)
slopeArray1 = np.expand_dims(slopeArray1, axis=0)
slopeArray1 = np.expand_dims(slopeArray1, axis=2)

slopeArray2=np.linspace(0, 15, num=numTimesteps)
slopeArray2 = np.expand_dims(slopeArray2, axis=0)
slopeArray2 = np.expand_dims(slopeArray2, axis=2)
maskArray=np.zeros((1,numTimesteps,1))
maskArray[0,numTimesteps-1]=1

X_train = np.concatenate((slopeArray1, slopeArray2))
X_mask = np.concatenate((maskArray, maskArray))

# preparing y_train
y_train = []
y_train = np.array([2*slopeArray1[0,19]-slopeArray1[0,18],
                    2*slopeArray2[0,19]-slopeArray2[0,18]]) # make target one delta higher

#
# Create model
#

inputs = Input(name='Input1', batch_shape=(1,numTimesteps,1))
X_mask_input = Input(name='Input2', batch_shape=(1,numTimesteps,1))
x = LSTM(units=1, name='LSTM1', return_sequences=True)(inputs)
x = multiply([x, X_mask_input])
x = Masking(mask_value=0.0)(x)
pred = LSTM(units=1, name='LSTM2', return_sequences=False, stateful=True)(x)
model = Model(inputs=[inputs, X_mask_input], outputs=pred)
model.compile(loss='mse', optimizer='sgd', metrics=['mse'])
print(model.summary())

#
# Train
# 
model.fit([X_train, X_mask], y_train, epochs=200, batch_size=1)

The idea is that if you can set a value (say 0.0) to be ignored using the mask, then the second LSTM will only process the final output, I think. More details here: https://keras.io/layers/core/

I hope this helps. Thanks.

stale[bot] commented 6 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

karan10111 commented 6 years ago

model.add(LSTM(output_dim, input_shape=(tsteps, 1), batch_size=batch_size, return_sequences=True, stateful=True))

Giving the batch size explicitly solved the same error for me.
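An equivalent way to pin the batch size is batch_input_shape; a sketch with placeholder sizes (assumed, not from the comment above):

from keras.models import Sequential
from keras.layers import LSTM, Dense

batch_size, tsteps, features = 32, 20, 1   # placeholder values

model = Sequential()
# batch_input_shape fixes the batch dimension explicitly, which stateful LSTMs require
model.add(LSTM(16, batch_input_shape=(batch_size, tsteps, features),
               return_sequences=True, stateful=True))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.summary()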

rajasrajeev commented 6 years ago

Input 0 is incompatible with layer conv1d_14: expected ndim=3, found ndim=2

Please help me with this.
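No code was posted, but this message usually means the array fed to a Conv1D is 2-D while Conv1D expects 3-D input (samples, steps, channels). A hedged sketch of the general fix, with placeholder shapes:

import numpy as np
from keras.models import Sequential
from keras.layers import Conv1D

x = np.zeros((100, 50))          # placeholder 2-D data: (samples, steps)
x = np.expand_dims(x, axis=-1)   # add a channels axis -> (100, 50, 1)

model = Sequential()
model.add(Conv1D(16, kernel_size=3, input_shape=(50, 1)))
model.summary()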

DIPTIMISHRA commented 6 years ago

# encoder: map an input to its encoded representation
encoder = Model(input_img, encoded)

# placeholder for an encoded input
encoded_input = Input(shape=(encoding_dim,))
print(encoded_input)

# last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]

# decoder
decoder = Model(encoded_input, decoder_layer(encoded_input))

I also got this error:

ValueError: Input 0 is incompatible with layer conv2d_46: expected ndim=4, found ndim=2

Please help me solve this issue.
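The full model isn't shown, so this is only a guess, but the error means a 2-D tensor reached a Conv2D layer, and Conv2D needs 4-D input (samples, height, width, channels). A hypothetical sketch that inserts a Reshape before the convolution; encoding_dim and the target shape here are invented for illustration:

from keras.layers import Input, Reshape, Conv2D
from keras.models import Model

encoding_dim = 7 * 7 * 8                       # placeholder size
encoded_input = Input(shape=(encoding_dim,))   # flat 2-D tensor: (None, 392)
x = Reshape((7, 7, 8))(encoded_input)          # -> (None, 7, 7, 8), now 4-D
x = Conv2D(16, (3, 3), padding='same', activation='relu')(x)
decoder = Model(encoded_input, x)
decoder.summary()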

tejasri19 commented 6 years ago

ValueError: Input 0 is incompatible with layer custom: expected min_ndim=3, found ndim=2

# set up transfer learning on the pre-trained ImageNet Inception_V3 model:
# remove the fully connected layer and replace it with a softmax for classifying 10 classes
incepV3_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
x = incepV3_model.output
x = Flatten(name='custom')(x)
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(nb_classes, activation='softmax')(x)
model = Model(input=incepV3_model.input, output=predictions)

I get the error:

Traceback (most recent call last):
  File "train.py", line 117, in <module>
    x = Flatten(name='custom')(x)
  File "/home/isemes/anaconda3/envs/tensorflow-venv/lib/python3.6/site-packages/keras/engine/base_layer.py", line 414, in __call__
    self.assert_input_compatibility(inputs)
  File "/home/isemes/anaconda3/envs/tensorflow-venv/lib/python3.6/site-packages/keras/engine/base_layer.py", line 327, in assert_input_compatibility
    str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer custom: expected min_ndim=3, found ndim=2
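For what it's worth, the snippet applies both Flatten and GlobalAveragePooling2D to the same tensor; a common transfer-learning head keeps only one of them and pools directly on the 4-D base output. A sketch assuming include_top=False and 10 classes:

from keras.applications.inception_v3 import InceptionV3
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

nb_classes = 10
base = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
x = base.output                    # 4-D tensor: (None, 8, 8, 2048)
x = GlobalAveragePooling2D()(x)    # pool straight from 4-D to (None, 2048); no Flatten needed
x = Dense(1024, activation='relu')(x)
predictions = Dense(nb_classes, activation='softmax')(x)
model = Model(inputs=base.input, outputs=predictions)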

fasecity commented 5 years ago

For those with this error: make sure you don't include the top in the base model:

base_model = tf.keras.applications.inception_v3.InceptionV3(weights='imagenet', include_top=False)

ZisisFl commented 5 years ago

I have received multiple different ValueErrors while trying to solve this and have changed many parameters. It is a time series problem: I have data for 215 items from 60 shops over 1034 days. I have split 973 days for train and 61 for test:

train_x = train_x.reshape((60, 973, 215))
test_x = test_x.reshape((60, 61, 215))
train_y = train_y.reshape((60, 973, 215))
test_y = test_y.reshape((60, 61, 215))

My model:

model = Sequential()
model.add(LSTM(100, input_shape=(train_x.shape[1], train_x.shape[2]), return_sequences='true'))
model.add(Dense(215))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history = model.fit(train_x, train_y, epochs=10,
                    validation_data=(test_x, test_y), verbose=2, shuffle=False)

ValueError: Error when checking input: expected lstm_1_input to have shape (973, 215) but got array with shape (61, 215)

fasecity commented 5 years ago

(quoting ZisisFl's comment above in full)

Before and after an LSTM layer you need an input layer and an output layer:

model = Sequential()
model.add(Dense(215, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(LSTM(100, return_sequences='true'))
model.add(Dense(1, activation='softmax'))

ZisisFl commented 5 years ago

I tried:

model.add(Dense(215, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(LSTM(100, return_sequences='true'))
model.add(Dense(1, activation='softmax'))

and got: ValueError: Error when checking target: expected dense_2 to have shape (973, 1) but got array with shape (973, 215)

Then tried

model.add(Dense(215, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(LSTM(100, return_sequences='true'))
model.add(Dense(215, activation='softmax'))

and got ValueError: Error when checking input: expected dense_1_input to have shape (973, 215) but got array with shape (61, 215)

My goal is to output predictions over 215 items for 60 shops in 61 days, something like 3660 x 215.
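One hedged sketch (not from the thread) that sidesteps the train/test length mismatch by leaving the timestep dimension unspecified, so the 973-day and 61-day sequences both fit the same input_shape:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

train_x = np.zeros((60, 973, 215)); train_y = np.zeros((60, 973, 215))   # dummy data
test_x  = np.zeros((60, 61, 215));  test_y  = np.zeros((60, 61, 215))

model = Sequential()
model.add(LSTM(100, input_shape=(None, 215), return_sequences=True))  # None = variable timesteps
model.add(Dense(215))                                                 # applied per timestep
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(train_x, train_y, epochs=10,
          validation_data=(test_x, test_y), verbose=2, shuffle=False)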

fasecity commented 5 years ago

(quoting ZisisFl's reply above in full)

I would try increasing the network size:

model.add(Dense(1024, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(LSTM(256, return_sequences='true'))
model.add(Dense(215, activation='softmax'))

cottrell commented 5 years ago

Did you get anywhere with this? I am stuck trying to get a multi-input implementation working where the batch size is not shared across the inputs. Basically, two independent dynamic batch sizes. No luck so far.

Nafees-060 commented 4 years ago

My input is a CSV file from which I made segments of about 400 samples, with 3 features (x, y, z). First I applied a 2-D CNN using model.add(Conv2D(16, (2, 2), activation='relu', input_shape=x_train[0].shape)), and it worked perfectly. For the LSTM, however, that input shape raised errors, so I changed the first layer to model.add(LSTM(32, input_shape=(400, 3), return_sequences=True)). The model then builds, but model.fit fails. Please find the code and error below. The output of x_train.shape, x_test.shape is: ((836, 400, 3), (209, 400, 3))

x_train = x_train.reshape(836, 400, 3, 1)   
x_test = x_test.reshape(209, 400, 3, 1)

x_train[0].shape  #output of this line: (400, 3, 1)

model = Sequential()     
model.add(LSTM(32, input_shape = (400,3), return_sequences=True))

model.add(Dropout(0.5)) 
model.add(Dense(100, activation='relu')) 
model.add(Flatten())
#Then Here we have Dense Layer 
model.add(Dense(64, activation= 'relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(3, activation='softmax'))
model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs = 10, validation_data = (x_test, y_test), verbose=1) 

ERROR

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-109-3ffd974b58e0> in <module>
      1 #Record this model tranning into a history
      2 
----> 3 history = model.fit(x_train, y_train, epochs = 10, validation_data = (x_test, y_test), verbose=1)
      4 #Below here you can see xthe training, here at the very first step 75% traning accuracy and 84% validation accuracy, After 10
      5 #epoc you see 91% of traning accuracy and 87% validaton accuracy, (As a complement, with accelrometer data, this is very good

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    726         max_queue_size=max_queue_size,
    727         workers=workers,
--> 728         use_multiprocessing=use_multiprocessing)
    729 
    730   def evaluate(self,

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in fit(self, model, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, **kwargs)
    222           validation_data=validation_data,
    223           validation_steps=validation_steps,
--> 224           distribution_strategy=strategy)
    225 
    226       total_samples = _get_total_number_of_samples(training_data_adapter)

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in _process_training_inputs(model, x, y, batch_size, epochs, sample_weights, class_weights, steps_per_epoch, validation_split, validation_data, validation_steps, shuffle, distribution_strategy, max_queue_size, workers, use_multiprocessing)
    545         max_queue_size=max_queue_size,
    546         workers=workers,
--> 547         use_multiprocessing=use_multiprocessing)
    548     val_adapter = None
    549     if validation_data:

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in _process_inputs(model, x, y, batch_size, epochs, sample_weights, class_weights, shuffle, steps, distribution_strategy, max_queue_size, workers, use_multiprocessing)
    592         batch_size=batch_size,
    593         check_steps=False,
--> 594         steps=steps)
    595   adapter = adapter_cls(
    596       x,

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, check_steps, steps_name, steps, validation_split, shuffle, extract_tensors_from_dataset)
   2470           feed_input_shapes,
   2471           check_batch_axis=False,  # Don't enforce the batch size.
-> 2472           exception_prefix='input')
   2473 
   2474     # Get typespecs for the input data and sanitize it if necessary.

c:\users\nafee\appdata\local\programs\python\python37\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    563                            ': expected ' + names[i] + ' to have ' +
    564                            str(len(shape)) + ' dimensions, but got array '
--> 565                            'with shape ' + str(data_shape))
    566         if not check_batch_axis:
    567           data_shape = data_shape[1:]

ValueError: Error when checking input: expected lstm_16_input to have 3 dimensions, but got array with shape (836, 400, 3, 1)

Any idea how to solve this problem?
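The thread does not answer this one, but the traceback points at the extra reshape: an LSTM expects 3-D input (samples, timesteps, features), and appending a channel axis makes the data 4-D. A sketch under that assumption (the trailing 1 is only needed for the Conv2D variant):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

x_train = np.zeros((836, 400, 3))   # keep the data 3-D for the LSTM branch; no reshape to (836, 400, 3, 1)

model = Sequential()
model.add(LSTM(32, input_shape=(400, 3), return_sequences=True))
model.summary()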

ajay9022 commented 4 years ago

The first argument (which is called "units" in the documentation) is the output dimensionality of that layer. In other words, you will have "units" LSTM cells at that layer. In Keras, the batch dimension is not typically specified in the model architecture.

I think units is not the number of LSTM cells in that layer; rather, it is the dimension of the hidden state vector inside each LSTM cell.

I had this model:

model3 = Sequential()
model3.add(Embedding(input_dim = vocab_size, output_dim = EMBEDDING_DIM, weights=[embedding_matrix], input_length=X.shape[1], trainable = False))
model3.add(SpatialDropout1D(0.2))
model3.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2, return_sequences=True)))
#model3.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2, return_sequences=True)))
model3.add(Bidirectional(LSTM(128, dropout=0.2, recurrent_dropout=0.2)))
model3.add(Dense(40, activation='softmax'))
model3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

epochs = 5
batch_size = 64

history3 = model3.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size,validation_split=0.1)    

and model3.summary() gives me

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_3 (Embedding)      (None, 50, 100)           2757300   
_________________________________________________________________
spatial_dropout1d_3 (Spatial (None, 50, 100)           0         
_________________________________________________________________
bidirectional_5 (Bidirection (None, 50, 128)           84480     
_________________________________________________________________
bidirectional_6 (Bidirection (None, 256)               263168    
_________________________________________________________________
dense_1 (Dense)              (None, 40)                10280     
=================================================================
Total params: 3,115,228
Trainable params: 357,928
Non-trainable params: 2,757,300
_________________________________________________________________

Here, the first Bi-LSTM has 64 units, but I don't see 64 anywhere in the output dimensions.
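A note on that last question: Bidirectional concatenates the forward and backward outputs by default, so 64 units show up as 128 in the summary. A small sketch with arbitrary shapes:

from keras.models import Sequential
from keras.layers import LSTM, Bidirectional

m = Sequential()
m.add(Bidirectional(LSTM(64, return_sequences=True), input_shape=(50, 100)))
print(m.output_shape)   # (None, 50, 128): 64 forward units + 64 backward units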

adam-grant-hendry commented 4 years ago

Why is this still open? Is there no solution?

Romyull-Islam commented 4 years ago

same thing happens when I wrote the following for the first LSTM layer:

model.add(LSTM(150,
               input_shape=trainX.shape,
               return_sequences=False))

I have the same issue. I used:

tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, input_shape=X.shape, return_sequences=True)),

and received the same error: Input 0 of layer sequential_15 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [1, 6]

jpandeinge commented 3 years ago

I have the same issue

# create and fit the LSTM network

model = Sequential()
model.add(LSTM(3, input_shape=(35753, 6), return_sequences=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x, y, epochs=10, batch_size=1, verbose=2)

and got this error:

ValueError: Input 0 of layer sequential_23 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [1, 6]

My x shape is (35753, 6) and my y shape is (35753, 1).
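A hedged sketch of one common fix: reshape the 2-D data to (samples, timesteps, features). Whether one timestep per sample (or some sliding window) is the right framing depends on the data; this only shows the shape mechanics:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

x = np.zeros((35753, 6)).reshape((35753, 1, 6))   # (samples, timesteps=1, features=6)
y = np.zeros((35753, 1))

model = Sequential()
model.add(LSTM(3, input_shape=(1, 6)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')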

jpandeinge commented 3 years ago

Guys, return sequences whenever you're using a second LSTM layer on top of another. Otherwise, if the LSTM is last, there is no need to set return_sequences to True.

I changed return_sequences to False but am still experiencing the same problem.

epx-dan commented 3 years ago

I was getting this error:

# error: Input 0 is incompatible with layer gru_1: expected ndim=3, found ndim=4
# This is my first GRU layer:
model.add(GRU(units=256, activation='relu', dropout=0.3, return_sequences=True, input_shape=input_shape))

Then, I changed the input_shape argument as the following:

model.add(GRU(units=256, activation='relu', dropout=0.3, return_sequences=True, input_shape=input_shape[1:]))

Now, the error disappeared.

Thus, just as @td2014 said, make sure to: (1) omit the batch dimension (batch_size) from the input_shape argument; (2) set return_sequences=True in any RNN layer whose output is fed into the next RNN layer.

MikaManurung commented 3 years ago

Has this issue been closed? I have the same error. I am trying named entity recognition, and here are the details of the train and test data: X_train shape: (3555, 120, 1024), X_test shape: (887, 120, 1024), y_train shape: (3555,), y_test shape: (887,).

input = Input(shape=(120,))
word_embedding_size = 1024
model = Embedding(input_dim=n_words, output_dim=word_embedding_size, input_length=120)(input)
model = Bidirectional(LSTM(units=word_embedding_size, 
                           return_sequences=True, 
                           dropout=0.5, 
                           recurrent_dropout=0.5, 
                           kernel_initializer=k.initializers.he_normal()))(model)
model = LSTM(units=word_embedding_size * 2, 
             return_sequences=True, 
             dropout=0.5, 
             recurrent_dropout=0.5, 
             kernel_initializer=k.initializers.he_normal())(model)
model = TimeDistributed(Dense(n_tags, activation="relu"))(model)  # previously softmax output layer

crf = CRF(n_tags)  # CRF layer
out = crf(model)  # output
model = Model(input, out)

adam = k.optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999)
model.compile(optimizer=adam, loss=crf.loss_function, metrics=[crf.accuracy, 'accuracy'])

model.summary()

model.fit(X_train , y_train, validation_data=(X_test, y_test), epochs=10, batch_size=32)

The error is : ValueError: Input 0 of layer bidirectional_14 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 120, 1024, 1024)

Please help me out; I have not been able to solve it using the other answers.
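The thread doesn't resolve this, but the shapes suggest X_train already contains embedded vectors, so the Embedding layer is what adds the fourth axis, giving (None, 120, 1024, 1024). A sketch under that assumption, feeding the 3-D tensors directly (the second LSTM, TimeDistributed Dense, and CRF head are omitted here):

import keras as k
from keras.layers import Input, Bidirectional, LSTM
from keras.models import Model

word_embedding_size = 1024
inputs = Input(shape=(120, word_embedding_size))   # X_train is (3555, 120, 1024): already embedded
x = Bidirectional(LSTM(units=word_embedding_size,
                       return_sequences=True,
                       dropout=0.5,
                       recurrent_dropout=0.5,
                       kernel_initializer=k.initializers.he_normal()))(inputs)
model = Model(inputs, x)
model.summary()   # output is (None, 120, 2048), which feeds the rest of the stack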

kty9912 commented 2 years ago

For everyone who came here because of this problem: your issue may be the ordering of your reshape calls. I solved it by switching the order of sequence.reshape() and sequence.shape(). Take another look at your code. Good luck!