thushv89 / attention_keras

Keras Layer implementation of Attention for Sequential models
https://towardsdatascience.com/light-on-math-ml-attention-with-keras-dc8dbc1fad39
MIT License
443 stars · 266 forks

TypeError: object of type 'Concatenate' has no len() #32

Closed · sainikmahata closed this 4 years ago

sainikmahata commented 4 years ago

I was trying to run an NMT model using a GRU and the AttentionLayer posted here, but I am getting "TypeError: object of type 'Concatenate' has no len()". I am using sparse_categorical_crossentropy as the loss because my data is integer encoded, not one-hot encoded. I am very new to Keras, and it would be helpful if you could point out my mistake. I am attaching the code snippet below. Another question: since I have set return_sequences=True in the decoder, do I need to wrap the final Dense layer in TimeDistributed?

```python
# (imports assumed, not shown in the original snippet; everything comes from
# tensorflow.keras, which this repo's AttentionLayer is written against)
from tensorflow.keras.layers import (Input, Embedding, Bidirectional, GRU,
                                     Concatenate, Dense, TimeDistributed)
from tensorflow.keras.models import Model

# Encoder: bidirectional GRU; the forward and backward final states are
# concatenated (64 + 64 = 128) to initialise the 128-unit decoder.
enc_inp = Input(shape=(286,))
enc_emb = Embedding(input_dim=91, output_dim=100)(enc_inp)
enc_out, forward_h, backward_h = Bidirectional(
    GRU(64, return_sequences=True, return_state=True))(enc_emb)
state_h = Concatenate()([forward_h, backward_h])
encoder_states = [state_h]

# Decoder: with return_state=True the GRU returns (outputs, state), so two
# values must be unpacked; the flattened `decout,=` in the original drops the
# state and never defines the `dec_out` used below.
dec_inp = Input(shape=(352,))
dec_emb = Embedding(input_dim=117, output_dim=100)(dec_inp)
dec_out, _ = GRU(128, return_sequences=True, return_state=True)(
    dec_emb, initial_state=encoder_states)

# Attention over the encoder outputs, conditioned on the decoder outputs.
attn_layer = AttentionLayer(name='attention_layer')
attn_out, attn_states = attn_layer([enc_out, dec_out])

decoder_concat_input = Concatenate(axis=-1, name='concat_layer')([dec_out, attn_out])

dec_dense = TimeDistributed(Dense(117, activation='softmax'))(decoder_concat_input)

model = Model([enc_inp, dec_inp], dec_dense)
model.compile(optimizer='nadam', loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])
# en_inp, fr_inp, fr_out are the integer-encoded source/target arrays
model.fit([en_inp, fr_inp], fr_out, batch_size=512, epochs=25, validation_split=0.1)
model.save('model.h5')
```

thushv89 commented 4 years ago

@sainikmahata ,

Please post the full stack trace.

sainikmahata commented 4 years ago

> @sainikmahata ,
>
> Please post the full stack trace.

Thanks for the reply. I have solved the issue; I just rewrote my code in TensorFlow 2.0.
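
For anyone else hitting this: rewriting in TensorFlow 2.0 likely means importing every layer from `tensorflow.keras`, the same Keras implementation this repo's AttentionLayer is built on; mixing those layers with the standalone `keras` package is a common trigger for this kind of TypeError. A minimal sketch of consistent imports (the AttentionLayer module path is an assumption, adjust it to your checkout):

```python
# Sketch: take every layer from tensorflow.keras (TF 2.x) so the model and the
# repo's AttentionLayer share a single Keras implementation.
from tensorflow.keras.layers import (Input, Embedding, Bidirectional, GRU,
                                     Concatenate, Dense, TimeDistributed)
from tensorflow.keras.models import Model
from layers.attention import AttentionLayer  # assumed path to this repo's attention.py
```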

sainikmahata commented 4 years ago

Solved