datalogue / keras-attention

Visualizing RNNs using the attention mechanism
https://medium.com/datalogue/attention-in-keras-1892773a4f22
GNU Affero General Public License v3.0

How do I pass the output of AttentionDecoder to an RNN layer? #52

Open BigWheel92 opened 4 years ago

BigWheel92 commented 4 years ago

I am trying to pass the decoder output to another RNN layer; however, it gives me the following error: TypeError: float() argument must be a string or a number, not 'Dimension'

```python
x_in = Input(shape=(x_train.shape[1], x_train.shape[2]), name='x_in')
meta_in = Input(shape=(x_meta_train.shape[1], x_meta_train.shape[2]), name='meta_in')

x = Bidirectional(LSTM(100, input_shape=(x_train.shape[1], x_train.shape[2]), activation='tanh', return_sequences=True))(x_in)
y = LSTM(100, input_shape=(x_meta_train.shape[1], x_meta_train.shape[2]), activation='tanh', return_sequences=True)(meta_in)

x_ = AttentionDecoder(50, x.shape[2], name='AD1')(x)
y_ = AttentionDecoder(50, y.shape[2], name='AD2')(y)

x__ = Bidirectional(LSTM(20, input_shape=(50, x.shape[2].value), activation='tanh', return_sequences=True))(x)  # TypeError: float() argument must be a string or a number, not 'Dimension'
y__ = Bidirectional(LSTM(20, input_shape=(50, y.shape[2].value), activation='tanh', return_sequences=True))(y)
```
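If the intent (per the issue title) is to feed the AttentionDecoder outputs into the next recurrent layer, one possible adjustment is to cast the TF1-style Dimension objects to plain ints and drop the redundant input_shape arguments, since the functional API infers shapes from the incoming tensor. This is only a sketch of a likely fix, assuming x_ and y_ are what should feed the final layers:

```python
# Sketch of a possible fix: Dimension objects from tensor.shape are not plain
# numbers in TF1-era Keras, so cast them with int(...) wherever an integer is expected.
x_ = AttentionDecoder(50, int(x.shape[2]), name='AD1')(x)
y_ = AttentionDecoder(50, int(y.shape[2]), name='AD2')(y)

# Feed the AttentionDecoder outputs (not x/y) into the following layers;
# input_shape is unnecessary here because the functional API infers it.
x__ = Bidirectional(LSTM(20, activation='tanh', return_sequences=True))(x_)
y__ = Bidirectional(LSTM(20, activation='tanh', return_sequences=True))(y_)
```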

user06039 commented 4 years ago

@BigWheel92 Have you implemented machine translation with attention using AttentionDecoder? If so, can you please provide a small tutorial or code snippet? I'm trying to learn seq2seq models but can't understand how to implement one and make predictions using this attention decoder. If you have done it, could you help me out a little bit?

BigWheel92 commented 4 years ago

@John-8704, I used SeqSelfAttention, available in the keras_self_attention library.
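For what it's worth, SeqSelfAttention is typically dropped in right after a recurrent layer that returns sequences. A minimal sketch, where the layer sizes and input shape are illustrative only and not from my actual model:

```python
from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, Dense
from keras_self_attention import SeqSelfAttention  # pip install keras-self-attention

# Illustrative shapes only: attention is computed over the LSTM's output sequence.
model = Sequential()
model.add(Bidirectional(LSTM(100, return_sequences=True), input_shape=(None, 32)))
model.add(SeqSelfAttention(attention_activation='sigmoid'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.summary()
```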

user06039 commented 4 years ago

@BigWheel92 I'm confused about how to implement it in my architecture. If you have built any machine translation seq2seq model, such as English-French translation, could you please share your work? I'd really like to know how to implement it with attention and run inference with it. I couldn't find any guide online.

BigWheel92 commented 4 years ago

Unfortunately, I haven't implemented a seq-to-seq architecture. The following link may help you understand how to use attention in seq-to-seq models: www.tensorflow.org/tutorials/text/nmt_with_attention
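For anyone who finds this later: based on how AttentionDecoder is used earlier in this thread, a rough sketch of a toy sequence-to-sequence-style model could look like the one below. The import path, layer sizes, and vocabulary size are assumptions for illustration, and note that this layer emits one output step per input step, so source and target sequences need the same (padded) length.

```python
from keras.models import Model
from keras.layers import Input, LSTM, Bidirectional
from models.custom_recurrents import AttentionDecoder  # adjust to wherever AttentionDecoder lives in your copy of the repo

# Illustrative hyperparameters only.
n_timesteps, n_features, n_vocab_out = 20, 64, 1000

inputs = Input(shape=(n_timesteps, n_features))
encoded = Bidirectional(LSTM(150, return_sequences=True))(inputs)
# AttentionDecoder(units, output_dim): one distribution over n_vocab_out per timestep.
decoded = AttentionDecoder(150, n_vocab_out)(encoded)

model = Model(inputs, decoded)
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
```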