shashankg7 / Keras-CNN-QA

Keras (re)implementation of paper "Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks. SIGIR, 2015"

Matrix size-incompatible #1

Open zhangzibin opened 7 years ago

zhangzibin commented 7 years ago

I got an exception:

Traceback (most recent call last):
  File "", line 132, in
    train_model(all_fname)
  File "", line 81, in train_model
    loss, acc = model.train_on_batch([x_trainq, x_traina], y_train1)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/", line 1226, in train_on_batch
    outputs = self.train_function(ins)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/", line 1096, in call
    updated = + [self.updates_op], feed_dict=feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 766, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [50,250], In[1]: [201,201]
  [[Node: MatMul_1 = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](cond_2/Merge, dense_2_W/read)]]
  [[Node: mul_5/_55 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1531_mul_5", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
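
For readers decoding the last line of the error: a matrix product requires the column count of the first operand to equal the row count of the second. The sketch below only replays the two shapes printed in the traceback; the helper name `matmul_shape` is made up for illustration and is not part of Keras or TensorFlow.

```python
def matmul_shape(a_shape, b_shape):
    """Return the shape of A @ B, or raise if the inner dimensions differ."""
    rows_a, cols_a = a_shape
    rows_b, cols_b = b_shape
    if cols_a != rows_b:
        raise ValueError(
            "Matrix size-incompatible: In[0]: [%d,%d], In[1]: [%d,%d]"
            % (rows_a, cols_a, rows_b, cols_b)
        )
    return (rows_a, cols_b)


# Shapes copied from the traceback: In[0] is the tensor reaching MatMul_1,
# In[1] is the weight matrix read from dense_2_W.
try:
    matmul_shape((50, 250), (201, 201))
    compatible = True
    message = ""
except ValueError as err:
    compatible = False
    message = str(err)

print(compatible, message)
```

In other words, the tensor arriving at `MatMul_1` carries 250 features per example while `dense_2_W` expects 201, so the mismatch is introduced somewhere upstream of that dense layer.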

shashankg7 commented 7 years ago

I am also running into some issues (though not this one) with the TensorFlow backend. Can you try running it with the Theano backend?

Miail commented 7 years ago

I have a similar issue. Here is example code using CIFAR-10 and VGG16 that replicates the error:

from keras.utils import np_utils

from keras import metrics
import keras
from keras import backend as K
from keras.layers import Conv1D,Conv2D,MaxPooling2D, MaxPooling1D, Reshape
from keras.models import Model
from keras.layers import Input, Dense
import tensorflow as tf
from keras.datasets import mnist,cifar10


batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 32, 32

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

#print('x_train shape:', x_train.shape)
#print(x_train.shape[0], 'train samples')
#print(x_test.shape[0], 'test samples')

# CIFAR-10 images have 3 colour channels
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 3, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 3, img_rows, img_cols)
    input_shape = (3, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 3)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 3)
    input_shape = (img_rows, img_cols, 3)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

x_train /= 255
x_test /= 255

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

def fws():
    #print "Inside"
    #   Params:
    #   batch ,  lr, decay , momentum, epochs
    #Input shape: (batch_size,40,45,3)
    #output shape: (1,15,50)
    # number of unit in conv_feature_map = splitd
    input = Input(shape=(img_rows,img_cols,3))
    zero_padded_section = keras.layers.convolutional.ZeroPadding2D(padding=(96,96), data_format='channels_last')(input)
    print(zero_padded_section)
    model = keras.applications.vgg16.VGG16(include_top = True,
                    weights = 'imagenet',
                    input_shape = (224,224,3),
                    pooling = 'max',
                    classes = 1000)

    model_output = model(input)

    dense1 = Dense(units = 512, activation = 'relu',    name = "dense_1")(model_output)
    dense2 = Dense(units = 256, activation = 'relu',    name = "dense_2")(dense1)
    dense3 = Dense(units = 10 , activation = 'softmax', name = "dense_3")(dense2)

    model = Model(inputs = input , outputs = dense3)
    #sgd = SGD(lr=0.08,decay=0.025,momentum = 0.99,nesterov = True)
    model.compile(loss="categorical_crossentropy", optimizer='adam' , metrics = [metrics.categorical_accuracy])
    model.fit(x_train[:500], y_train[:500],
              validation_data=(x_test[:10], y_test[:10]))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])
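
A note on the shapes in the snippet above: it builds `zero_padded_section` but then calls the VGG16 model on `input`, the unpadded 32x32 tensor. Assuming that is what triggers the mismatch, the arithmetic can be sketched without Keras. VGG16 has five 2x2 max-pooling stages and 512 channels in its last convolutional block; the helper name `vgg16_flatten_size` is hypothetical.

```python
def vgg16_flatten_size(height, width, num_pools=5, channels=512):
    """Number of features left after VGG16's convolutional base is flattened."""
    for _ in range(num_pools):
        # each 2x2 max-pool with stride 2 halves the spatial size (rounding down)
        height //= 2
        width //= 2
    return height * width * channels


# ZeroPadding2D(padding=(96, 96)) adds 96 pixels on each side of a 32x32 image.
padded_side = 32 + 2 * 96

print(padded_side)                                    # side length after padding
print(vgg16_flatten_size(padded_side, padded_side))   # what the top dense layers expect
print(vgg16_flatten_size(32, 32))                     # what an unpadded CIFAR-10 image yields
```

If that reading is right, passing `zero_padded_section` to the VGG16 model instead of `input` would give the fully connected layers the feature count they were built for.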

lbmili2018 commented 6 years ago

Hello, can you tell me how to construct the pointwise training instances to feed into the CNN? For example, given the triples (Xi, Yij, Zij), I don't know how to feed them into the CNN in practice.
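
One common reading of the pointwise setup is that every (question, candidate answer, label) triple becomes an independent training example. The sketch below works under that assumption and uses toy data; `pad`, `build_pointwise`, the vocabulary, and the maximum lengths are all hypothetical, and this is not the repository's actual preprocessing. The three returned lists are meant to line up with the `[x_trainq, x_traina], y_train1` arguments seen in the traceback above.

```python
def pad(ids, max_len, pad_id=0):
    """Right-pad (or truncate) a list of token ids to exactly max_len entries."""
    return (ids + [pad_id] * max_len)[:max_len]


def build_pointwise(grouped, vocab, max_q=5, max_a=8, unk_id=1):
    """Flatten (question, [(answer, label), ...]) groups into parallel lists."""
    x_q, x_a, y = [], [], []
    for question, candidates in grouped:
        q_ids = pad([vocab.get(w, unk_id) for w in question.split()], max_q)
        for answer, label in candidates:
            a_ids = pad([vocab.get(w, unk_id) for w in answer.split()], max_a)
            x_q.append(q_ids)   # the question is repeated once per candidate
            x_a.append(a_ids)
            y.append(label)     # 1 = relevant answer, 0 = not relevant
    return x_q, x_a, y


# Toy data, invented for illustration only.
vocab = {"who": 2, "wrote": 3, "hamlet": 4, "shakespeare": 5, "it": 6}
grouped = [
    ("who wrote hamlet", [("shakespeare wrote it", 1), ("nobody knows", 0)]),
]
x_q, x_a, y = build_pointwise(grouped, vocab)
print(x_q, x_a, y)
```

Converted to arrays, `x_q` and `x_a` would be the two model inputs and `y` the binary targets, one row per question-answer pair.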