Multi-input Text CNN using Graph

4fur4 commented 8 years ago

Hi all,

I'm trying to use a multi-input version of the imdb_cnn example:

https://github.com/fchollet/keras/blob/master/examples/imdb_cnn.py

Following the Graph example in the literature, I have this:

print('Build model...')
graph = Graph()

# Branch one
graph.add_input(name='input1', input_shape=(maxlen, ))
graph.add_node(Embedding(max_features, embedding_dims, input_length=maxlen), name='embedding1', input='input1')
graph.add_node(Convolution1D(nb_filter=nb_filter,
                             filter_length=filter_length,
                             border_mode="valid",
                             activation="relu",
                             subsample_length=1), name='conv1', input='embedding1')
graph.add_node(MaxPooling1D(pool_length=2), name='max1', input='conv1')
graph.add_node(Flatten(), name='flat1', input='max1')
graph.add_node(Dense(1), name='dense1', input='flat1')

# Branch two
graph.add_input(name='input2', input_shape=(maxlen, ))

graph.add_node(Embedding(max_features, embedding_dims, input_length=maxlen), name='embedding2', input='input2')
graph.add_node(Convolution1D(nb_filter=nb_filter,
                             filter_length=filter_length,
                             border_mode="valid",
                             activation="relu",
                             subsample_length=1), name='conv2', input='embedding2')
graph.add_node(MaxPooling1D(pool_length=2), name='max2', input='conv2')
graph.add_node(Flatten(), name='flat2', input='max2')
graph.add_node(Dense(1), name='dense2', input='flat2')

# Merge
graph.add_output(name='output', inputs=['dense1', 'dense2'], merge_mode='sum')
graph.compile(optimizer='rmsprop', loss={'output': 'mean_absolute_error'})

history = graph.fit({'input1': X_train, 'input2': X2_train, 'output': y_train}, nb_epoch=10)
predictions = graph.predict({'input1': X_test, 'input2': X2_test})

However, when I run it I get the following error:

    graph.compile(optimizer='rmsprop', loss={'output': 'mean_absolute_error'})
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 1045, in compile
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 532, in get_output
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 962, in get_output
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 173, in get_input
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 834, in get_output
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 173, in get_input
  File "build/bdist.linux-x86_64/egg/keras/layers/convolutional.py", line 375, in get_output
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 173, in get_input
  File "build/bdist.linux-x86_64/egg/keras/layers/convolutional.py", line 140, in get_output
  File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 173, in get_input
  File "build/bdist.linux-x86_64/egg/keras/layers/embeddings.py", line 100, in get_output
  File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 130, in gather
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/var.py", line 162, in __getitem__
    return _operators.__getitem__(self, *args)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/var.py", line 502, in __getitem__
    return self.take(args[axis], axis)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/var.py", line 534, in take
    return theano.tensor.subtensor.take(self, indices, axis, mode)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/subtensor.py", line 2386, in take
    return take(a, indices.flatten(), axis, mode).reshape(shape, ndim)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/subtensor.py", line 2364, in take
    return advanced_subtensor1(a, indices)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 600, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/subtensor.py", line 1687, in make_node
    raise TypeError('index must be integers')
TypeError: index must be integers

I guess the Embedding layer doesn't like what the input layer is giving it.

The hyperparameters and the dimensions of X_train/X2_train and X_test/X2_test are the same as in imdb_cnn.py. In my case y_train and y_test contain float values (for regression), but a Sequential model with a single input works fine, so I guess this is not the issue.

Any ideas on how to solve this?

jarfo commented 8 years ago

Add dtype='int' to the arguments of add_input.
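For the code in the question, that would presumably mean writing graph.add_input(name='input1', input_shape=(maxlen, ), dtype='int'), and the same for 'input2'. The traceback is consistent with the input being declared as a float tensor when no dtype is given: the Embedding layer does an index lookup (the gather call in the traceback), and a lookup only accepts integer indices. The snippet below is a rough plain-Python sketch of that idea, not Keras or Theano code. The embed helper and the toy table are made up for illustration.

```python
# Toy stand-in for an embedding layer: a lookup table indexed by token id.
# Hypothetical illustration only; this is not the Keras implementation.

def embed(table, token_ids):
    """Return one embedding row per token id (a 'gather' operation)."""
    return [table[i] for i in token_ids]

# 4-word vocabulary with 2-dimensional embeddings (made-up values).
table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]

int_ids = [2, 0, 3]          # integer indices, analogous to dtype='int'
float_ids = [2.0, 0.0, 3.0]  # same values, but typed as floats

# Integer ids select rows from the table.
rows = embed(table, int_ids)

# Float ids are rejected even though the values are whole numbers,
# because indexing requires an integer type.
try:
    embed(table, float_ids)
    float_lookup_failed = False
except TypeError:
    float_lookup_failed = True
```

The point is that the values can be identical. What matters is the declared type of the input that reaches the lookup.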

4fur4 commented 8 years ago

Great, it worked. Thanks a lot. Shouldn't this be the default dtype for that method?

entron commented 8 years ago

What a coincidence! I just figured out this exact problem. dtype='int' seems to be an unusual argument in Keras. Why do we have this argument here?

keras-team / keras

Multi-input Text CNN using Graph #1493