Closed: abelmouhcine closed this 5 years ago
Hello, should the attention map be transposed? I can't find that in the papers. Also, I think you should use dim=0 in softmax.
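To make the question concrete, here is a minimal sketch of how the softmax axis and a transpose interact. It assumes PyTorch and uses a made-up 2x3 score matrix (`scores` and the other names are hypothetical, not taken from the repository's code):

```python
import torch

# Hypothetical attention scores (assumption: rows = queries, columns = keys).
scores = torch.tensor([[1.0, 2.0, 3.0],
                       [1.0, 1.0, 1.0]])

# dim=0 normalizes down each column; dim=1 normalizes across each row.
attn_dim0 = torch.softmax(scores, dim=0)
attn_dim1 = torch.softmax(scores, dim=1)

# Transposing first and then normalizing over dim=1 gives the same values
# as normalizing over dim=0 and transposing afterwards.
attn_t = torch.softmax(scores.t(), dim=1)

print(attn_dim0.sum(dim=0))  # column sums
print(attn_dim1.sum(dim=1))  # row sums
print(torch.allclose(attn_t, attn_dim0.t()))
```

So whether `dim=0` is right seems to depend on which axis holds the positions being attended over, and on whether the map is transposed before or after the softmax.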