Open Superklez opened 3 years ago
Is the input shape of MultiHeadAttention [batch_size, sequence_length, embedding_size], or is it the same as nn.MultiheadAttention, which expects input of shape [sequence_length, batch_size, embedding_size]?
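To make the two layouts concrete, here is a minimal sketch using PyTorch's nn.MultiheadAttention in its default configuration. The sizes and variable names are arbitrary and chosen only for illustration, and it says nothing about what this repository's MultiHeadAttention expects. It only shows that the two layouts differ by a swap of the first two axes.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes (assumptions, not taken from this repository).
batch_size, seq_len, embed_dim, num_heads = 4, 10, 16, 2

# Constructed with defaults, nn.MultiheadAttention takes sequence-first input.
attn = nn.MultiheadAttention(embed_dim, num_heads)

# A batch-first tensor: [batch_size, sequence_length, embedding_size].
x_batch_first = torch.randn(batch_size, seq_len, embed_dim)

# Swap the first two axes: [sequence_length, batch_size, embedding_size].
x_seq_first = x_batch_first.transpose(0, 1)

# Self-attention: query, key and value are the same tensor.
out_seq_first, weights = attn(x_seq_first, x_seq_first, x_seq_first)

# Swap back if the rest of the model works batch-first.
out_batch_first = out_seq_first.transpose(0, 1)

print(out_seq_first.shape, out_batch_first.shape, weights.shape)
```

If MultiHeadAttention here turns out to be batch-first, the two transpose calls would be the only change needed to move between it and nn.MultiheadAttention.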