chaitjo / structured-self-attention

Keras implementation of the Structured Self-Attentive Sentence Embedding model
https://arxiv.org/abs/1703.03130
MIT License

Attention does not support masking #1

Open LincLabUCCS opened 5 years ago

LincLabUCCS commented 5 years ago

Hello. I set the parameter 'mask_zero = True' in the embedding layer, but it raises an error that the attention layer does not support masking: "Layer attention_layer1 does not support masking, but was passed an input_mask: Tensor("sequence_word_embeddings_3/NotEqual:0", shape=(?, 750), dtype=bool)"

Is there a way to solve this?

Thank you for sharing the code

chaitjo commented 5 years ago

Sorry, I did not get around to adding testable code for this. Indeed, the Conv1D layers used to implement the attention blocks do not support masking.

Here are some workarounds:

  1. Set mask_zero=False in the embedding layer, so zeros are not treated as special mask values. The hope is that the model still learns to treat the zero padding as a special case nonetheless (see the sketch after this list).

  2. Follow this discussion on Stack Overflow and see if someone has an implementation of masked 1D convolutions: https://stackoverflow.com/questions/43392693/how-to-input-mask-value-to-convolution1d-layer

  3. If you want a more powerful attention model for NLP tasks, look into the Transformer; there are several good open-source implementations available.
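
For reference, here is a minimal sketch of workaround 1 combined with a hand-rolled mask applied directly to the attention scores, assuming Keras 2 with the TensorFlow backend. The layer names, sizes, and the single-hop attention shown here are illustrative and are not copied from this repository's code.

```python
import keras.backend as K
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense, Lambda
from keras.models import Model

# Illustrative sizes (750 matches the sequence length in the error above).
max_len, vocab_size, embed_dim, lstm_units = 750, 20000, 100, 150

words = Input(shape=(max_len,), dtype='int32')

# Workaround 1: with mask_zero=False no mask tensor is propagated, so
# downstream layers no longer raise the "does not support masking" error.
# Padding tokens (index 0) are embedded like any other word.
embedded = Embedding(vocab_size, embed_dim, mask_zero=False)(words)
hidden = Bidirectional(LSTM(lstm_units, return_sequences=True))(embedded)

# Hand-rolled alternative to true masking: compute attention scores, then
# add a large negative value at padded positions before the softmax so they
# receive (near-)zero attention weight. This is single-hop attention for
# illustration, not the multi-hop attention matrix from the paper.
scores = Dense(1)(hidden)  # (batch, max_len, 1)

def masked_softmax(args):
    s, ids = args
    s = K.squeeze(s, axis=-1)                   # (batch, max_len)
    pad = K.cast(K.equal(ids, 0), K.floatx())   # 1.0 at padded positions
    return K.softmax(s - 1e9 * pad)

weights = Lambda(masked_softmax)([scores, words])   # (batch, max_len)

# Weighted sum of the LSTM states gives a fixed-size sentence embedding.
sentence = Lambda(
    lambda args: K.sum(K.expand_dims(args[0], axis=-1) * args[1], axis=1)
)([weights, hidden])

model = Model(inputs=words, outputs=sentence)
model.summary()
```

Adding a large negative value to the scores before the softmax is the usual trick for excluding padded positions from attention without needing the attention layers themselves to support Keras masks.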

LincLabUCCS commented 5 years ago

Thank you, chaitjo.