keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Shape and dimension issues with vectors representing text #2668

Closed wailoktam closed 7 years ago

wailoktam commented 8 years ago

Hi, Can anyone help me with getting the shapes and dimension right when feeding vectors representing text to a CNN?

The input vectors are 127 matrices, each of dimension 100 (token number) x 100 (vector representing the token as generated by word2vec). They are stored in a 3d numpy array.

When doing the convolution layer, I first try:

Convolution2D(10, 3, 3, border_mode='same', input_shape=(100, 100))

which gives me:

Input 0 is incompatible with layer convolution2d_1: expected ndim=4, found ndim=3

The documentation tells me that 10 is the number of filters, and my understanding from this tutorial on CNNs:

http://cs231n.github.io/convolutional-networks/

is that 2D convolution is possible for 2D input.

I consider this a historical artifact of CNNs being first used with images, which have a channel dimension. To get rid of the error message, I add one dimension to the input shape and set it to 1, without being certain that this is the right thing to do.

Convolution2D(10, 3, 3, border_mode='same', input_shape=(1, 100, 100))

But I cannot get very far.

I cannot add a softmax activation layer, getting this error message.

Exception: Cannot apply softmax to a tensor that is not 2D or 3D. Here, ndim=4

I need this activation layer to implement a cnn model mentioned in this paper:

http://arxiv.org/pdf/1508.01585v2.pdf

But for the time being, I skip it in order to proceed with debugging other parts of the program. Then another error comes up from the following:

model.fit([leftData, rightData], labels, nb_epoch=10, batch_size=32)

The error message reads: Wrong number of dimensions: expected 3, got 2 with shape (32, 1)

It feels like whether I add or remove a dimension, I cannot make it fit the expectation of Keras on video/image data.

Any advice would be appreciated. Thanks in advance.


https://github.com/mynlp/qa/commit/05ac682c91ee656e90f4ccd5ea85396ef13c45a3

joelthchao commented 8 years ago

First, your input needs to have shape (batch_size, channels, rows, cols) in order to be fed into a Convolution2D layer, so it's necessary to add an additional dim and give your data the shape (127, 1, 100, 100). Second, if you understand what softmax does to your data, then you will understand why it only accepts 2d or 3d data. You probably need to reshape your layer's output before feeding it into softmax. Last, your labels must be mismatched with your output. Please post more details of your model and your loss.
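On the softmax point, here is a minimal pure-Python sketch of why the number of dimensions matters. Softmax normalizes along one axis so that the values on that axis sum to 1. With a 2D (batch_size, nb_classes) tensor that axis is obvious; with a 4D convolution output it is not, which is presumably why the layer refuses it. The helper name softmax and the toy scores are illustrative only and are not Keras code.

```python
import math


def softmax(row):
    """Softmax over one 1-D list of scores (a single sample)."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


# A 2-D "tensor" of shape (batch_size, nb_classes): one softmax per row.
scores = [[1.0, 2.0, 3.0],
          [0.0, 0.0, 0.0]]
probs = [softmax(row) for row in scores]
row_sums = [sum(p) for p in probs]  # each row now sums to 1
```

For a 4D output such as (batch_size, 10, 100, 100) there is no single "class" axis, so flattening it (and usually adding a Dense layer) before softmax gives the 2D shape this sketch assumes.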

lqj1990 commented 8 years ago

For the problem in your CNN layer, please think about images. An RGB image usually has three channels, so the input for a 2D CNN should be (samples, channels, rows, cols). When you process text there is just one channel, so you must first reshape your data to (samples, 1, rows, cols). Just use Reshape((1, rows, cols)).

For the problem in your fit step, please check the input of your model:

1. If you define Input(), make sure the dimensions of the real inputs correspond to the dimensions you defined in the Input layer.
2. If not, you can print model.input_shape to check the input dimensions, then decide whether you need to revise your model or your data.

I hope my experience helps.
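As a rough illustration of the shapes involved, the sketch below computes where the single channel axis goes for each dim_ordering, and what to pass as input_shape (which leaves out the sample axis). The function with_channel_axis is a hypothetical helper written for this example; it only manipulates shape tuples and does not touch Keras.

```python
def with_channel_axis(shape, dim_ordering="th"):
    """Return the 4-D shape after adding one channel axis.

    shape is (samples, rows, cols); dim_ordering follows the
    'th' / 'tf' convention quoted from the Keras documentation.
    """
    samples, rows, cols = shape
    if dim_ordering == "th":
        return (samples, 1, rows, cols)
    if dim_ordering == "tf":
        return (samples, rows, cols, 1)
    raise ValueError("unknown dim_ordering: %r" % (dim_ordering,))


data_shape = (127, 100, 100)            # 127 texts, 100 tokens, 100-d vectors
th_shape = with_channel_axis(data_shape, "th")
tf_shape = with_channel_axis(data_shape, "tf")

# input_shape describes one sample, so the leading sample axis is dropped.
input_shape_arg = th_shape[1:]
```

With the Theano ordering this gives (127, 1, 100, 100) for the data and (1, 100, 100) for input_shape.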

wailoktam commented 8 years ago

Hi, thanks to both of you for the helpful responses. I will respond to joelthchao's post about the model building part first.

it's necessary to add an additional dim and make your data with shape (127, 1, 100, 100).

I had this in mind, but I am confused by the documentation, which says, in the section about the 2D convolution layer:

When using this layer as the first layer in a model, provide the keyword argument input_shape (tuple of integers, does not include the sample axis), e.g. input_shape=(3, 128, 128) for 128x128 RGB pictures.

and, in the section about the input shape of 2d convolution layer:

4D tensor with shape: (samples, channels, rows, cols) if dim_ordering='th' or 4D tensor with shape: (samples, rows, cols, channels) if dim_ordering='tf'.

I don't get this point:

if you understand what softmax does to your data, then you will understand why it only accepts 2d or 3d data.

Isn't it just a way to normalize all the data, no matter what order or dimension they are in? From the formula found in textbooks, I cannot see why it works only on 2D or 3D data.

Below is the code for building my model:

leftKerasModel = Sequential()
leftKerasModel.add(Reshape((1,100,100), input_shape=(100,100)))
leftKerasModel.add(Convolution2D(10, 3, 3, border_mode='same', input_shape=(127, 1, 100, 100)))
leftKerasModel.add(Activation('relu'))
leftKerasModel.add(MaxPooling2D(pool_size=(2, 2)))
rightKerasModel = Sequential()
rightKerasModel.add(Reshape((1,100,100), input_shape=(100,100)))
rightKerasModel.add(Convolution2D(10, 3, 3, border_mode='same', input_shape=(127, 1, 100, 100)))
rightKerasModel.add(Activation('relu'))
rightKerasModel.add(MaxPooling2D(pool_size=(2, 2)))
mergedKerasModel = Sequential()
merged = Merge([leftKerasModel, rightKerasModel], mode=lambda x: x[0]-x[1], output_shape=(10, 100,100))
mergedKerasModel.add(merged)
mergedKerasModel.add(Activation('softmax'))

The merge mode is set this way just for testing. The original mode, a cosine similarity function, does not work. Since that is a separate problem, I removed it.

wailoktam commented 8 years ago

Hi, now comes the problem with fitting my data to the model. Thanks to lqj1990 for sharing your experience.

This is the shape attribute of the data I feed to the cnn:

('q3dArray shape:', (127, 100, 100))
('qMatrix shape:', (1, 100, 100))
('a3dArray shape:', (127, 100, 100))
('aMatrix shape:', (1, 100, 100))
('labels shape:', (127,))

qMatrix is one input sample for the left cnn. aMatrix is one input sample for the right cnn.

The error message I get:

Wrong number of dimensions: expected 3, got 2 with shape (32, 1)

from:

model.fit([leftData, rightData], labels, nb_epoch=10, batch_size=32)

seems to be about the batch size (32) and the number of samples (1), rather than the rows and the columns. I don't know what is going on.

joelthchao commented 8 years ago

@wailoktam

merged = Merge([leftKerasModel, rightKerasModel], mode=lambda x: x[0]-x[1], output_shape=(10, 100,100))

The output_shape should be (10, 50, 50), due to MaxPooling2D.

model.fit([leftData, rightData], labels, nb_epoch=10, batch_size=32)

I still don't know what loss you use in this task. At the least, the output of your last layer should have the same shape as your labels.
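The (10, 50, 50) figure can be checked by hand. The sketch below tracks one sample's shape through a stride-1 convolution with border_mode='same' and a 2x2 max pooling, assuming the 'th' (channels, rows, cols) ordering and non-overlapping pooling. Both helper functions are hypothetical and do only the shape arithmetic.

```python
def conv2d_same_shape(input_shape, nb_filter):
    """Shape after a stride-1 'same' convolution: rows/cols unchanged,
    channels replaced by the number of filters."""
    _, rows, cols = input_shape
    return (nb_filter, rows, cols)


def max_pool_shape(input_shape, pool_size):
    """Shape after non-overlapping max pooling (stride == pool size)."""
    channels, rows, cols = input_shape
    return (channels, rows // pool_size[0], cols // pool_size[1])


shape = (1, 100, 100)                             # after Reshape
shape = conv2d_same_shape(shape, nb_filter=10)    # (10, 100, 100)
shape = max_pool_shape(shape, pool_size=(2, 2))   # (10, 50, 50)
```

An element-wise merge such as x[0] - x[1] keeps this shape, which is why output_shape has to describe the pooled size rather than the input size.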

wailoktam commented 8 years ago

Hi Joel, thanks. I am using a custom loss function:

model.compile(loss='custom_objective', optimizer=sgd)

the custom_objective function is defined in the keras library as:

def custom_objective(y_true, y_pred):
    if (y_true == 1):
        result = theano.tensor.maximum(0.09 - y_pred, 0.)
    else:
        result = theano.tensor.maximum(0.09 + y_pred, 0.)
    return result

joelthchao commented 8 years ago

According to the paper you mentioned, you will need to flatten the representations from leftKerasModel and rightKerasModel, merge them with a cosine similarity layer, and produce output with shape (1,). By the way, cosine similarity can be calculated as 1 - cosine_distance, and cosine_distance can be calculated by Merge with mode='cos'. Hope this helps!
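To make the flatten-then-compare idea concrete, here is a small pure-Python sketch. It flattens two nested feature maps into vectors, computes their cosine similarity, and derives the distance as 1 minus the similarity. The names flatten and cosine_similarity and the toy inputs are assumptions made for this example; which of the two quantities Merge with mode='cos' actually returns is worth checking against the Keras source for your installed version.

```python
import math


def flatten(nested):
    """Recursively flatten nested lists into one flat list of numbers."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


left = [[1.0, 0.0], [0.0, 1.0]]        # toy "feature map" from one branch
right = [[1.0, 0.0], [0.0, 1.0]]       # identical map from the other branch
opposite = [[-1.0, 0.0], [0.0, -1.0]]

similarity_same = cosine_similarity(flatten(left), flatten(right))
similarity_opposite = cosine_similarity(flatten(left), flatten(opposite))
distance_same = 1.0 - similarity_same
```

Each pair of samples yields a single number, which is where the (1,) output shape per sample comes from.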

wailoktam commented 8 years ago

Thanks, Joel.

So I create a Lambda layer after the merged layer that computes 1 minus the output of the merged layer, right?

joelthchao commented 8 years ago

@wailoktam Exactly.

wailoktam commented 8 years ago

Hi, I changed the activation function to tanh. (I still don't understand why only softmax is not allowed.) Now I get this complaint from Keras:

model.fit([leftData, rightData], labels, nb_epoch=10, batch_size=32)

ValueError: ('You cannot drop a non-broadcastable dimension.', ((False, False, False, False, False), (0, 1, 2, 4)))

Any idea is appreciated.

wailoktam commented 8 years ago

Hi, can anyone show me how to use Reshape or something else properly to get rid of the wrong dimension error?

My input data is of shape (127,100,100), where 127 is the number of samples. I add a reshape line before the first layer:

kerasModel.add(Reshape((1,100,100), input_shape=(100,100)))

The first layer is defined as a 2D convolution layer with input_shape (1,100,100):

kerasModel.add(Convolution2D(10, 3, 3, border_mode='same', input_shape=(1, 100, 100)))

I leave out the sample number when instantiating the input_shape value. I think this is the right way to do it although the documentation contradicts itself in different places about it.

joelthchao commented 8 years ago

You can remove input_shape in the second layer.

kerasModel.add(Reshape((1, 100, 100), input_shape=(100, 100)))
kerasModel.add(Convolution2D(10, 3, 3, border_mode='same'))

wailoktam commented 8 years ago

Hi, thanks for your suggestion. I have tried it, but I am still getting the error:

Wrong number of dimensions: expected 4, got 2 with shape (32, 1).')

where 32 is the batch size used when fitting the training data to the model.

I try reshaping the data, instead of creating a reshape layer by doing the following:

testData = numpy.reshape(testData, (127,1,100,100)).astype(theano.config.floatX)

It does not work either.

Can anyone post a tested example of reshaping? Thanks in advance.

joelthchao commented 8 years ago

@wailoktam Could you paste your code?

wailoktam commented 8 years ago

Hi,

This is the code for building the model:

sequential = Sequential()
sequential.add(Reshape((1, 100, 100), input_shape=(100, 100)))
sequential.add(Convolution2D(10, 3, 3, border_mode='same'))
sequential.add(Activation("relu"))

This is the code for training the model:

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
km.compile(loss='hinge', optimizer=sgd)
km.fit(testData, labels, nb_epoch=1, batch_size=32)

I also tried adding the following line, which reshapes the data:

testData = numpy.reshape(testData, (127,1,100,100)).astype(theano.config.floatX)

and commenting out the Reshape layer. It does not work either.

Thanks.

joelthchao commented 8 years ago

According to this error message: Wrong number of dimensions: expected 4, got 2 with shape (32, 1).'). I guess your labels have shape (batch_size, 1), but your model has output with 4 dims.

If this is a classification task, you will need to do two things: one is to Flatten the output, the other is to make your labels one-hot vectors with shape (batch_size, class_num).

By the way, you should paste complete code instead of fragments, since the error might come from any part of your code. Try to allow others to reproduce your error.
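For the one-hot part, here is a minimal pure-Python sketch of what turning integer labels into vectors of shape (batch_size, class_num) looks like. The function to_one_hot is a hypothetical stand-in written for illustration; it is not the Keras utility itself.

```python
def to_one_hot(labels, nb_classes):
    """Turn integer class labels into one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * nb_classes
        row[label] = 1.0   # mark the position of the true class
        rows.append(row)
    return rows


labels = [0, 1, 1, 0]                 # shape (4,)
one_hot = to_one_hot(labels, 2)       # shape (4, 2)
one_hot_shape = (len(one_hot), len(one_hot[0]))
```

The second axis of the labels then has the same length as the output of a final Dense(2) layer, which is the match the error message is asking for.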

wailoktam commented 8 years ago

Hi, thanks for looking at my code. I trimmed it down to the following, which should reproduce the error I get. Thanks.

sequential = Sequential()
sequential.add(Reshape((1, 100, 100), input_shape=(100, 100)))
sequential.add(Convolution2D(10, 3, 3, border_mode='same'))
sequential.add(Activation("relu"))
test3dArray = numpy.random.random((127, 100,100))
testLabels = numpy.random.randint(2, size=127)
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
sequential.compile(loss='hinge', optimizer=sgd)
sequential.fit(test3dArray, testLabels, nb_epoch=1, batch_size=32)

joelthchao commented 8 years ago

As I said in my last comment, your labels must have the same shape as the network's output.

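import numpy as np
from keras.models import Sequential
from keras.layers.core import Reshape, Activation, Flatten, Dense
from keras.layers.convolutional import Convolution2D
from keras.optimizers import SGD
from keras.utils import np_utils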

sequential = Sequential()
sequential.add(Reshape((1, 100, 100), input_shape=(100, 100)))
sequential.add(Convolution2D(10, 3, 3, border_mode='same'))
sequential.add(Activation("relu"))
# here we have output with 4-dim shape

# flatten and add Dense to produce shape (batch_size, nb_classes)
# softmax is for classification task, change output into probability.
sequential.add(Flatten())
sequential.add(Dense(2))
sequential.add(Activation('softmax'))

test3dArray = np.random.random((127, 100, 100))
testLabels = np.random.randint(2, size=127)
# turn integer label into one-hot vector
testLabels = np_utils.to_categorical(testLabels, 2)

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

sequential.compile(loss='hinge', optimizer=sgd)
sequential.fit(test3dArray, testLabels, nb_epoch=1, batch_size=32)

You can use sequential.summary() to see what your network looks like.

wailoktam commented 8 years ago

Hi Joel, thanks for the helpful comment.

I have been following this tutorial:

http://cs231n.github.io/convolutional-networks/

I am trying to create the simplest convolutional model as described under the Layer Patterns section:

INPUT -> CONV -> RELU -> FC

I think FC(fully connected layer) = Dense. Are you suggesting that I cannot just put a Dense after a Relu layer in Keras? I must first flatten it such that the dimension matches, right?

joelthchao commented 8 years ago

I think FC(fully connected layer) = Dense.

Yes

I must first flatten it such that the dimension matches, right?

Yes, it's necessary to do that.
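A short sketch of the arithmetic behind that: Flatten turns each sample's (channels, rows, cols) output into one vector, and the Dense layer's weight matrix needs one row per element of that vector. The helper names below are hypothetical and only compute sizes.

```python
from functools import reduce
from operator import mul


def flattened_size(feature_shape):
    """Number of values per sample once the feature map is flattened."""
    return reduce(mul, feature_shape, 1)


def dense_weight_shape(feature_shape, output_dim):
    """Shape of the weight matrix a fully connected layer would need."""
    return (flattened_size(feature_shape), output_dim)


conv_output = (10, 100, 100)   # 10 filters, 'same' border, no pooling
n_features = flattened_size(conv_output)
weights = dense_weight_shape(conv_output, 2)
```

So INPUT -> CONV -> RELU -> FC from the tutorial becomes Reshape, Convolution2D, Activation, Flatten, Dense here, with Flatten making the 4D output fit a 2D weight matrix.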

wailoktam commented 8 years ago

Hi, I now modify the test model to something closer to the real model I want to build.

leftKerasModel = Sequential()
leftKerasModel.add(Reshape((1,100,100),  input_shape=(100, 100)))
leftKerasModel.add(Convolution2D(10, 3, 3, border_mode='same'))
leftKerasModel.add(Activation('relu'))
leftKerasModel.add(MaxPooling2D(pool_size=(2, 2)))

rightKerasModel = Sequential()
rightKerasModel.add(Reshape((1,100,100), input_shape=(100,100)))
rightKerasModel.add(Convolution2D(10, 3, 3, border_mode='same'))
rightKerasModel.add(Activation('relu'))
rightKerasModel.add(MaxPooling2D(pool_size=(2, 2)))

mergedKerasModel = Sequential()
merged = Merge([leftKerasModel, rightKerasModel], mode=lambda x: x[0] - x[1], output_shape=(10,50,50))
mergedKerasModel.add(merged)
mergedKerasModel.add(Flatten())
mergedKerasModel.add(Dense(2))
mergedKerasModel.add(Activation('softmax'))

But Keras seems to object to the value set for output_shape in this line:

merged = Merge([leftKerasModel, rightKerasModel], mode=lambda x: x[0] - x[1], output_shape=(10,50,50))

I get this error message:

ValueError: Shape mismatch: x has 25000 cols (and 32 rows) but y has 2500 rows (and 2 cols)
Apply node that caused the error: Dot22(Reshape{2}.0, dense_1_W)
Toposort index: 63
Inputs types: [TensorType(float32, matrix), TensorType(float32, matrix)]
Inputs shapes: [(32, 25000), (2500, 2)]
Inputs strides: [(100000, 4), (8, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[SoftmaxWithBias(Dot22.0, dense_1_b)]]

I think I am getting 10x50x50 after the pooling, and merging does not change the dimensions. What should I put on the RHS of output_shape?

The lambda operation is there for testing only. I get a puzzling error when doing a cosine there. So I want to try something simple first.

Thanks.
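Reading the numbers in that error: 25000 is 10 * 50 * 50, so the flattened data does match output_shape=(10, 50, 50); it is the Dense weight matrix that was built for only 2500 inputs. Below is a small pure-Python sketch of the shape rule behind the failing Dot22. The helper matmul_shape is hypothetical and only mirrors the rule; why Dense ended up with 2500 rows is not something this thread establishes.

```python
def matmul_shape(x_shape, y_shape):
    """Result shape of a 2-D matrix product, or ValueError on mismatch."""
    if x_shape[1] != y_shape[0]:
        raise ValueError(
            "Shape mismatch: x has %d cols but y has %d rows"
            % (x_shape[1], y_shape[0]))
    return (x_shape[0], y_shape[1])


batch_size = 32
actual_features = 10 * 50 * 50     # 25000 values per flattened sample

# Weights sized from the real flattened width multiply cleanly.
ok_shape = matmul_shape((batch_size, actual_features), (actual_features, 2))

# The shapes reported in the traceback do not.
try:
    matmul_shape((32, 25000), (2500, 2))
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)
```

Printing the layer output shapes (for example with summary()) before fitting may show where the 2500 comes from.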

wailoktam commented 8 years ago

Hi, I change these lines:

merged = Merge([leftKerasModel, rightKerasModel], mode=lambda x: x[0] - x[1], output_shape=(10,50,50))
mergedKerasModel.add(merged)
mergedKerasModel.add(Flatten())
mergedKerasModel.add(Dense(2))
mergedKerasModel.add(Activation('softmax'))

to:

merged = Merge([leftKerasModel, rightKerasModel], mode= 'cos', output_shape=(1))
mergedKerasModel.add(merged)
mergedKerasModel.add(Activation('sigmoid'))

Now I am getting: ValueError: ('You cannot drop a non-broadcastable dimension.', ((False, False, False, False, False), (0, 1, 2, 4)))

What is wrong?

joelthchao commented 8 years ago

Quick answer: use output_shape=(1,). (1) is an int and (1,) is a tuple.
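This is plain Python behaviour and easy to check in an interpreter. Parentheses alone only group an expression; the trailing comma is what makes a one-element tuple:

```python
as_int = (1)       # just the integer 1 in parentheses
as_tuple = (1,)    # a tuple with one element

kinds = (type(as_int).__name__, type(as_tuple).__name__)
tuple_length = len(as_tuple)
```

Any shape argument with a single dimension needs the trailing comma for the same reason.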

wailoktam commented 8 years ago

Hi, I cannot get this (using cosine to merge two layers) to work. Let me paste a runnable piece of code that does a concat when merging two layers (I will replace it with cosine, which is what I really want but have problems with):

test3dLArray = numpy.random.random((127, 100,100))
test3dRArray = numpy.random.random((127, 100,100))
testLabels = numpy.random.randint(2, size=127)
testLabels = np_utils.to_categorical(testLabels, 2)

leftKerasModel = Sequential()
leftKerasModel.add(Reshape((1,100,100),  input_shape=(100, 100)))
leftKerasModel.add(Convolution2D(10, 3, 3, border_mode='same'))
leftKerasModel.add(Activation('relu'))
leftKerasModel.add(MaxPooling2D(pool_size=(2, 2)))

rightKerasModel = Sequential()
rightKerasModel.add(Reshape((1,100,100), input_shape=(100,100)))
rightKerasModel.add(Convolution2D(10, 3, 3, border_mode='same'))
rightKerasModel.add(Activation('relu'))
rightKerasModel.add(MaxPooling2D(pool_size=(2, 2)))

mergedKerasModel = Sequential()
merged = Merge([leftKerasModel, rightKerasModel], mode='concat')
mergedKerasModel.add(merged)
mergedKerasModel.add(Flatten())
mergedKerasModel.add(Dense(2))
mergedKerasModel.add(Activation('softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
mergedKerasModel.compile(loss='custom_objective', optimizer=sgd)
mergedKerasModel.fit([test3dLArray, test3dRArray], testLabels, nb_epoch=10, batch_size=32)

Now I revise it and incorporate some of the code you provided in #2672. This is done by replacing the following:

 mergedKerasModel = Sequential()
 merged = Merge([leftKerasModel, rightKerasModel], mode='concat')
 mergedKerasModel.add(merged)
 mergedKerasModel.add(Flatten())
 mergedKerasModel.add(Dense(2))
 mergedKerasModel.add(Activation('softmax'))

with:

cos_distance = Merge([leftKerasModel, rightKerasModel], mode='cos', dot_axes=1)
cos_distance = Reshape((1,))(cos_distance)
cos_similarity = Lambda(lambda x: 1-x)(cos_distance)
mergedKerasModel = Model([leftKerasModel, rightKerasModel], [cos_similarity])

As a result, I get this error:

File "trainKeras.py", line 315, in
  km = make_test_network()
File "trainKeras.py", line 134, in make_test_network
  cos_distance = Reshape((1,))(cos_distance)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Keras-1.0.2-py2.7.egg/keras/engine/topology.py", line 452, in __call__
  '". This layer has no information'
Exception: You tried to call layer "reshape_3". This layer has no information about its expected input shape, and thus cannot be built. You can build it manually via: layer.build(batch_input_shape)

So I add the merged layer in another way:

mergedKerasModel = Sequential()
mergedKerasModel.add(Merge([leftKerasModel,rightKerasModel], mode='cos', dot_axes=1))
mergedKerasModel.add(Reshape((1,)))
mergedKerasModel.add(Lambda(lambda x: 1-x))

And I get a different error message:

File "trainKeras.py", line 322, in
  train_test_model(km,test3dLArray, test3dRArray, testLabels)
File "trainKeras.py", line 158, in train_test_model
  km.fit([leftData, rightData], labels, nb_epoch=1, batch_size=32)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Keras-1.0.2-py2.7.egg/keras/models.py", line 409, in fit
  sample_weight=sample_weight)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Keras-1.0.2-py2.7.egg/keras/engine/training.py", line 1052, in fit
  callback_metrics=callback_metrics)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Keras-1.0.2-py2.7.egg/keras/engine/training.py", line 790, in _fit_loop
  outs = f(ins_batch)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Keras-1.0.2-py2.7.egg/keras/backend/theano_backend.py", line 518, in __call__
  return self.function(inputs)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg/theano/compile/function_module.py", line 871, in __call__
  storage_map=getattr(self.fn, 'storage_map', None))
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg/theano/gof/link.py", line 314, in raise_with_op
  reraise(exc_type, exc_value, exc_trace)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg/theano/compile/function_module.py", line 859, in __call__
  outputs = self.fn()
ValueError: Input dimension mis-match. (input[1].shape[0] = 32, input[2].shape[0] = 200000000)
Apply node that caused the error: Elemwise{Composite{(i0 - (i1 * i2))}}(TensorConstant{(1, 1) of 1.0}, lambda_1_target, Elemwise{Sub}[(0, 1)].0)
Toposort index: 113
Inputs types: [TensorType(float32, (True, True)), TensorType(float32, matrix), TensorType(float32, col)]
Inputs shapes: [(1, 1), (32, 2), (200000000, 1)]
Inputs strides: [(4, 4), (8, 4), (4, 4)]
Inputs values: [array([[ 1.]], dtype=float32), 'not shown', 'not shown']
Outputs clients: [[Elemwise{maximum,no_inplace}(Elemwise{Composite{(i0 - (i1 * i2))}}.0, TensorConstant{(1, 1) of 0.0}), Elemwise{Composite{((i0 * EQ(i1, i2) * i3 * i4 * i5) / i6)}}[(0, 1)](TensorConstant{(1, 1) of -1.0}, Elemwise{maximum,no_inplace}.0, Elemwise{Composite{(i0 - (i1 * i2))}}.0, InplaceDimShuffle{x,x}.0, InplaceDimShuffle{0,x}.0, lambda_1_target, Elemwise{Mul}[(0, 0)].0)]]

Any suggestion? Thanks in advance.
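One way to read that last traceback is through its "Inputs shapes" line: (1, 1), (32, 2) and (200000000, 1). The second entry is lambda_1_target, that is, the labels, which are still one-hot with two columns; the third is the model output, whose first axis is not the batch size. The sketch below checks such shapes with NumPy-style broadcasting rules. That is an assumption made for illustration (Theano's own rules are stricter), and broadcast_shape is a hypothetical helper.

```python
def broadcast_shape(*shapes):
    """Combine shapes under NumPy-style broadcasting, or raise ValueError."""
    ndim = max(len(s) for s in shapes)
    padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
    result = []
    for axis, dims in enumerate(zip(*padded)):
        sizes = set(d for d in dims if d != 1)   # size-1 axes can stretch
        if len(sizes) > 1:
            raise ValueError(
                "Input dimension mis-match on axis %d: %s"
                % (axis, sorted(sizes)))
        result.append(sizes.pop() if sizes else 1)
    return tuple(result)


# Shapes copied from the traceback: constant, labels, model output.
shapes_in_error = [(1, 1), (32, 2), (200000000, 1)]
try:
    broadcast_shape(*shapes_in_error)
    error = None
except ValueError as exc:
    error = str(exc)

# What a single-output model with (batch_size, 1) labels would look like.
compatible = broadcast_shape((1, 1), (32, 1), (32, 1))
```

If this reading is right, two things are worth trying: flatten each branch before the cosine merge so the output has one value per sample, and pass labels of shape (127, 1) instead of the two-column one-hot labels.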

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.