lsdefine / attention-is-all-you-need-keras

A Keras+TensorFlow Implementation of the Transformer: Attention Is All You Need

the mask of attention #28

zjjzyl opened this issue 5 years ago

zjjzyl commented 5 years ago

In transformer.py, line 87, `mask = Lambda(lambda x: K.repeat_elements(x, n_head, 0))(mask)` gives the mask (in readout_model) the shape `(batch_size * n_head, x, x)`, while the output of `reshape1` has shape `(n_head * batch_size, x, x)`. The shapes look the same, but the elements are ordered differently: `repeat_elements` puts the `n_head` copies of each sample next to each other, whereas `reshape1` groups all samples of one head together.
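A minimal sketch of the ordering difference (a toy standalone example, not from the repo; assumes TF 2.x eager mode):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K

n_head = 2
# Toy 3-D "mask" with batch_size = 2: sample 0 is all zeros, sample 1 all ones.
mask = tf.constant(np.stack([np.zeros((2, 2)), np.ones((2, 2))]), dtype='float32')

# repeat_elements interleaves copies per sample: [s0, s0, s1, s1]
# -> (batch_size * n_head, ...) ordering.
rep = K.repeat_elements(mask, n_head, 0)

# tile stacks whole-batch copies: [s0, s1, s0, s1]
# -> (n_head * batch_size, ...) ordering, matching reshape1's layout.
til = K.tile(mask, [n_head, 1, 1])

print(rep[:, 0, 0].numpy())  # [0. 0. 1. 1.]
print(til[:, 0, 0].numpy())  # [0. 1. 0. 1.]
```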

Maybe `repeat_elements` could be changed to `tile`? A sketch of the change is below.
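One way the fix could look (a sketch only, assuming the mask is 3-D at this point and that `n_head`, `Lambda`, and `K` are in scope as in transformer.py):

```python
# Before: batch-major copies, mismatching reshape1's head-major layout
# mask = Lambda(lambda x: K.repeat_elements(x, n_head, 0))(mask)

# After: head-major copies, matching reshape1
mask = Lambda(lambda x: K.tile(x, [n_head, 1, 1]))(mask)
```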