keras-team / keras


Batch size problem with custom layer. #6070

Closed GauravBh1010tt closed 7 years ago

GauravBh1010tt commented 7 years ago

Hi, I am implementing a Siamese LSTM as described in Siamese Recurrent Architectures. My custom layer works for a batch size of 1, but it gives a dimension-mismatch error when the batch size is greater than 1. My code is:

class Exp(Layer):
    def __init__(self, **kwargs):
        super(Exp, self).__init__(**kwargs)

    def call(self, x, mask=None):
        h1 = x[0]
        h2 = x[1]
        dif = K.sum(K.abs(h1 - h2), axis=1)
        h = K.exp(-dif)
        return h

    def get_output_shape_for(self, input_shape):
        out_shape = list(input_shape[0])
        out_shape[-1] = 1
        return tuple(out_shape)

def buildModel():
    inpx = Input(shape=(dimx,), dtype='int32', name='inpx')
    x = Embedding(output_dim=embedding_dim, input_dim=vocab_size, input_length=dimx)(inpx)
    inpy = Input(shape=(dimy,), dtype='int32', name='inpy')
    y = Embedding(output_dim=embedding_dim, input_dim=vocab_size, input_length=dimy)(inpy)

    hx = LSTM(LSTM_neurons, name='hx1')(x)
    hy = LSTM(LSTM_neurons, name='hy1')(y)

    h = Exp()([hx, hy])

    model = Model([inpx, inpy], [h])
    model.compile(loss='mse', optimizer='rmsprop')
    model.fit([X_train_l, X_train_r], np.array(train_score), nb_epoch=5,
              batch_size=4, verbose=1)

The error that I am getting is:

    ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[1] == 1, but the output's size on that axis is 4.

    Inputs shapes: [(1, 4), (4, 1)]
    Inputs strides: [(0, 1), (1, 0)]
    Inputs values: [CudaNdarray([[ 0.32689139  0.38685739  0.16391446  0.213199  ]]), CudaNdarray([[ 0.45875001]
     [ 0.52499998]
     [ 0.55000001]
     [ 0.85000002]])]

This is the way Keras core layers are defined, and they work well for varied batch sizes. Any suggestions?

Thank you!
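(Editor's note: the shape mismatch in the error can be reproduced outside Keras. The sketch below, assuming a batch of 4 and a hypothetical 50 LSTM units per branch, shows that summing over `axis=1` without keeping the axis turns the layer's output into shape `(4,)`, which then broadcasts against the `(4, 1)` targets exactly as the `(1, 4)` / `(4, 1)` pair in the Theano error:)

```python
import numpy as np

# Hypothetical shapes standing in for the issue's setup:
# a batch of 4 samples, 50 LSTM units per branch.
batch, units = 4, 50
rng = np.random.RandomState(0)
h1 = rng.rand(batch, units)   # stand-in for the 'hx1' LSTM output
h2 = rng.rand(batch, units)   # stand-in for the 'hy1' LSTM output

# The layer's call(): summing over axis=1 *drops* that axis,
# so the prediction comes out as shape (4,), not (4, 1).
pred = np.exp(-np.sum(np.abs(h1 - h2), axis=1))
targets = rng.rand(batch, 1)  # train_score has shape (batch, 1)

# Under broadcasting, (4,) lines up against (4, 1) as (1, 4) vs (4, 1),
# producing a (4, 4) residual instead of an elementwise one.
diff = pred - targets
print(pred.shape, targets.shape, diff.shape)  # (4,) (4, 1) (4, 4)
```

With batch size 1 the collapsed axis happens to broadcast harmlessly, which is why the layer appeared to work in that case.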

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

JustinhoCHN commented 6 years ago

Have you solved that problem yet?

GauravBh1010tt commented 6 years ago

@JustinhoCHN - Yes, the issue is resolved. Although I haven't uploaded the code for this exact problem, my implementation of a Neural Tensor Network is based on the general solution for varied batch sizes: NTN.

Hope it helps. Gaurav.
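(Editor's note: the thread does not show the final code, but the usual fix for this pattern is to pass keepdims=True to the sum so the output keeps its batch dimension as (batch, 1) and matches the (batch, 1) targets; the Keras backend's K.sum accepts the same keepdims argument as NumPy. A minimal NumPy sketch of the corrected computation:)

```python
import numpy as np

def exp_manhattan(h1, h2):
    """Manhattan-distance similarity between two branches,
    keeping a (batch, 1) output shape."""
    # keepdims=True preserves the summed axis as size 1,
    # so the result is (batch, 1) rather than (batch,).
    dif = np.sum(np.abs(h1 - h2), axis=1, keepdims=True)
    return np.exp(-dif)

rng = np.random.RandomState(0)
h = exp_manhattan(rng.rand(4, 50), rng.rand(4, 50))
print(h.shape)  # (4, 1)
```

In the custom layer this corresponds to `K.sum(K.abs(h1 - h2), axis=1, keepdims=True)` in `call`, with `get_output_shape_for` returning `(input_shape[0][0], 1)`.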