TianhaoFu closed this issue 3 years ago.
Hi,
This is not a bug. In this case the attention module is called by TransformerEncoder, which accepts attn_mask and length_mask as optional inputs and appropriately calls the underlying attention. Please find the documentation here, and for more details please take a look at the forward pass of TransformerEncoder.
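For illustration, here is a minimal usage sketch of that flow. The hyperparameters and the `clusters` kwarg below are assumptions made for the example (check the builder documentation for your version): masks are given to the encoder, which builds permissive defaults when they are omitted and forwards them to the attention internally.

```python
import torch
from fast_transformers.builders import TransformerEncoderBuilder
from fast_transformers.masking import LengthMask

# Build an encoder that uses clustered attention. All hyperparameters here,
# including the `clusters` kwarg, are made up for this example.
encoder = TransformerEncoderBuilder.from_kwargs(
    attention_type="clustered",
    n_layers=2,
    n_heads=4,
    query_dimensions=32,
    value_dimensions=32,
    feed_forward_dimensions=256,
    clusters=16,
).get()

N, L, D = 8, 100, 4 * 32                  # batch, sequence length, model dim
x = torch.rand(N, L, D)
lengths = torch.randint(50, L + 1, (N,))  # true length of each sequence
length_mask = LengthMask(lengths, max_len=L)

# attn_mask and length_mask are optional; when omitted, the encoder creates
# defaults and passes them down to the attention as attn_mask, query_lengths
# and key_lengths, so the attention module is never called directly by hand.
y = encoder(x, length_mask=length_mask)
```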
I will close the issue. Feel free to reopen it if you have more questions.
Hi, in fast-transformers/fast_transformers/attention/clustered_attention.py, I found that the clustered attention forward function looks like this:
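(The snippet below is a reconstructed sketch of the signature being referred to, based on the parameter names mentioned in this question; it is not a verbatim copy of the file.)

```python
def forward(self, queries, keys, values, attn_mask, query_lengths, key_lengths):
    ...
```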
But in fast-transformers/fast_transformers/attention/attention_layer.py it only passes a parameter x and doesn't pass 'attn_mask', 'query_lengths', or 'key_lengths'.
Is this a bug? Thanks
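For reference, here is a minimal sketch of what the answer above describes, calling the attention layer directly with the three arguments in question. The constructor arguments and shapes are assumptions for the example, not taken from attention_layer.py.

```python
import torch
from fast_transformers.attention import AttentionLayer, ClusteredAttention
from fast_transformers.masking import FullMask, LengthMask

N, L, H, E = 2, 100, 4, 32                 # batch, sequence length, heads, head dim
layer = AttentionLayer(ClusteredAttention(clusters=16), d_model=H * E, n_heads=H)

x = torch.rand(N, L, H * E)
attn_mask = FullMask(L)                                           # no attention masking
length_mask = LengthMask(torch.full((N,), L, dtype=torch.int64))  # all positions valid

# attn_mask, query_lengths and key_lengths are expected by the attention;
# inside the library, TransformerEncoder supplies defaults like these, so
# they do not need to be passed by hand.
out = layer(x, x, x, attn_mask=attn_mask,
            query_lengths=length_mask, key_lengths=length_mask)
```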