lsdefine / attention-is-all-you-need-keras

A Keras+TensorFlow Implementation of the Transformer: Attention Is All You Need

Issues with Keras Lambda Layers #3

Closed wfmonster closed 6 years ago

wfmonster commented 6 years ago

I'm running into a lot of errors when attempting to run transformer.py for testing purposes.

The issue begins with:

(100000, 7) (100000, 9)
X:  [[ 2 11 12 ...  7  4  3]
 [ 2 10 11 ... 12 12  3]
 [ 2  5  5 ... 13  6  3]
 ...
 [ 2 13 11 ... 12  5  3]
 [ 2  7 12 ...  6 11  3]
 [ 2  6  4 ...  7 13  3]]
Y:  [[ 2  4 20 ... 19 14  3]
 [ 2  4 20 ... 11  8  3]
 [ 2  4 20 ...  8  3  0]
 ...
 [ 2  4 20 ... 19  9  3]
 [ 2  4 20 ... 19  3  0]
 [ 2  4 20 ... 15  3  0]]
2018-07-10 18:46:13.676502: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "transformer.py", line 586, in <module>
    s2s.compile('adam')
  File "transformer.py", line 396, in compile
    enc_output = self.encoder(src_seq, src_pos, active_layers=active_layers)
  File "transformer.py", line 306, in __call__
    mask = Lambda(lambda x:GetPadMask(emb, emb))(src_seq)
  File "/Users/user/anaconda2/envs/tfdeeplearning/lib/python3.6/site-packages/keras/engine/base_layer.py", line 460, in __call__
    output = self.call(inputs, **kwargs)
  File "/Users/user/anaconda2/envs/tfdeeplearning/lib/python3.6/site-packages/keras/layers/core.py", line 693, in call
    return self.function(inputs, **arguments)
  File "transformer.py", line 306, in <lambda>
    mask = Lambda(lambda x:GetPadMask(emb, emb))(src_seq)
  File "transformer.py", line 255, in GetPadMask
    ones = K.expand_dims(K.ones_like(Q, 'float32'), -1)
AttributeError: 'Tensor' object has no attribute 'expand_dims'

What versions of Keras and TensorFlow are you using for development? Could you add that info to a requirements.txt file, or possibly to the README? I am wondering if this is an issue caused by conflicting versions. I am using tensorflow 1.8.0 and Keras 2.2.0.
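For reference, a requirements.txt pinned to the versions I am running would just be:

tensorflow==1.8.0
Keras==2.2.0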

I've tried wrapping the operations in Lambda layers, which works for the first two lines of the GetPadMask function, but I run into issues again with the K.batch_dot operation. Any ideas? I am relatively new to the Keras framework. Roughly what I tried is sketched below.
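(This is only a sketch of my attempt, not the repo's code; src_seq stands in for the raw integer token input, and the shapes in the comments are my reading of GetPadMask.)

from keras import backend as K
from keras.layers import Input, Lambda

src_seq = Input(shape=(None,), dtype='int32')  # raw token ids, 0 = padding
# Each backend op wrapped in its own Lambda layer; these first two steps work for me
ones = Lambda(lambda q: K.expand_dims(K.ones_like(q, 'float32'), -1))(src_seq)             # (batch, len_q, 1)
mask = Lambda(lambda k: K.cast(K.expand_dims(K.not_equal(k, 0), 1), 'float32'))(src_seq)   # (batch, 1, len_k)
# K.batch_dot takes two tensors, so the Lambda gets a list of inputs; this is the step I'm unsure about
pad_mask = Lambda(lambda t: K.batch_dot(t[0], t[1], axes=[2, 1]))([ones, mask])            # (batch, len_q, len_k)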

lsdefine commented 6 years ago

In your code, mask = Lambda(lambda x:GetPadMask(emb, emb))(src_seq), the wrapped function is not related to x (src_seq) at all, which is strange. The last error is also strange, because K should be a module, not a Tensor. Is K keras.backend in your code? The pattern I would expect is sketched below.
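(A minimal sketch of what I mean, assuming src_seq is the raw integer token sequence and GetPadMask is the function from transformer.py; the point is that the wrapped function should use the Lambda's own input:)

from keras import backend as K
from keras.layers import Lambda

# One Lambda around the whole mask computation, operating on its own input x
mask = Lambda(lambda x: GetPadMask(x, x))(src_seq)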

I can't say more without additional information.

wfmonster commented 6 years ago

Yes, K is tf.keras.backend. I went ahead and cloned a fresh copy of the repository, and the fresh code runs fine now. I must have made an error somewhere while working on my modifications. Thank you for getting back to me.

I was also curious about the new QANet blocks you've added. Do you have more info on them, or are they for personal research purposes (if you don't mind elaborating)? I currently do NMT research, so I would be interested in any additional materials or thoughts.

lsdefine commented 6 years ago

The QANet blocks can be used to encode a sequence with CNN + self-attention. For example:

# Token embeddings (d_model = 64) with dropout
x = Embedding(words.num(), 64)(input)
x = Dropout(0.5)(x)
# Padding mask: 1.0 for real tokens (id > 0), 0.0 for padding
mask = Lambda(lambda x: K.cast(K.greater(x, 0), 'float32'))(input)
# Stack of CNN + self-attention encoder blocks over the embedded sequence
x = QANet_Encoder(64, n_head=4, n_conv=2, n_block=3, kernel_size=5, dropout=0.5, add_pos=False)(x, mask)
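(For completeness, a sketch of wiring this into a standalone encoder model; vocab_size is a stand-in for words.num(), and this assumes QANet_Encoder returns a Keras tensor as above:)

from keras import backend as K
from keras.layers import Input, Embedding, Dropout, Lambda
from keras.models import Model
# QANet_Encoder as provided by this repo (import path omitted)

vocab_size = 10000                              # stand-in for words.num()
seq_in = Input(shape=(None,), dtype='int32')    # variable-length token ids, 0 = padding
x = Embedding(vocab_size, 64)(seq_in)
x = Dropout(0.5)(x)
mask = Lambda(lambda s: K.cast(K.greater(s, 0), 'float32'))(seq_in)
x = QANet_Encoder(64, n_head=4, n_conv=2, n_block=3, kernel_size=5,
                  dropout=0.5, add_pos=False)(x, mask)
encoder = Model(seq_in, x)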

Moreover, they are basic elements for my implementation of the QANet model. I may release the implementation when I have time to organize the code.