-
Thank you for providing the code.
I have a question about Fig. 5 in your paper.
As I understand it, an attention value is computed for each triple (s, e, t) based on Eq. (3).
In Fig. 5, you se…
-
Hey there,
I'm interested in Char2Wav; thanks for your code.
Would you update it with an attention mechanism?
-
In the tutorial, I find that the "attention" mechanism is not a real attention mechanism, since the computed attention weights have no relationship to the encoder output vectors.
The implementation in the orig…
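For contrast, here is a minimal sketch (not the tutorial's code, names are illustrative) of dot-product attention in which the weights genuinely depend on the encoder outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4
enc_out = rng.standard_normal((T, d))   # encoder output vectors
query = rng.standard_normal(d)          # current decoder state

scores = enc_out @ query                # one score per encoder position
weights = np.exp(scores - scores.max())
weights /= weights.sum()                # softmax over positions
context = weights @ enc_out             # weighted sum of encoder outputs
```

Because the scores are inner products with `enc_out`, changing the encoder outputs changes the weights, which is exactly the dependence the "fake" version lacks.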
-
### Model description
"Attention Is All You Need" is a landmark 2017 research paper authored by eight scientists working at Google, which expanded on the 2014 attention mechanisms proposed by Bah…
-
Hi,
Your attention mechanism is quite slow. Since you recompute the linear projections (aw and bw) at every step even though they do not change, the running time is almost quadratic.
I have implemented a faster versi…
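The speedup described above can be sketched as follows (a minimal NumPy illustration; `W_a`/`W_b` stand in for the issue's `aw`/`bw`, and the shapes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 50, 32
enc_out = rng.standard_normal((T, d))   # encoder outputs, fixed per utterance
W_a = rng.standard_normal((d, d))       # "aw": key projection
W_b = rng.standard_normal((d, d))       # "bw": query projection

def attend_slow(query):
    # Re-projects every encoder state on each call, even though the result
    # is identical every time.
    keys = enc_out @ W_a
    q = query @ W_b
    scores = keys @ q
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ enc_out

# Faster: project the encoder outputs once and reuse the cached keys.
keys_cached = enc_out @ W_a

def attend_fast(query):
    q = query @ W_b
    scores = keys_cached @ q
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ enc_out

query = rng.standard_normal(d)
```

Both functions return the same context vector; only the redundant projection is removed, so per-step cost drops from two matrix products over the whole sequence to one cached lookup plus a vector projection.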
-
Flash Attention can only be used with fp16 and bf16, not with fp32. Therefore, we should make Flash Attention optional in our codebase so that one can deactivate it during inference in exchange for hi…
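One way to make it optional could look like the sketch below (assuming PyTorch; the `use_flash` flag and function name are hypothetical, not from the codebase):

```python
import torch

def attention(q, k, v, use_flash=True):
    # Flash kernels only support fp16/bf16 on GPU, so fp32 inputs (or an
    # explicit use_flash=False) take the plain math path, trading speed
    # for higher precision during inference.
    if use_flash and q.dtype in (torch.float16, torch.bfloat16) and q.is_cuda:
        # PyTorch may dispatch this to a flash attention kernel.
        return torch.nn.functional.scaled_dot_product_attention(q, k, v)
    scale = q.shape[-1] ** -0.5
    weights = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
    return weights @ v
```

On CPU or fp32 inputs the function falls back to the explicit softmax formulation, which matches the fused kernel's result up to floating-point error.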
-
Very good idea in your conclusion, such as further investigating the incorporation of ECA with a spatial attention module. I think the spatial attention mechanism is formed by the sliding of the convolution ke…
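The "sliding convolution kernel" view of spatial attention can be sketched as a CBAM-style module (a hedged illustration, not the paper's implementation): channel-wise average and max pooling are concatenated and a convolution slides over them to produce one weight per spatial location.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        # Conv over the 2-channel pooled map yields a single attention map.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                       # x: (N, C, H, W)
        avg = x.mean(dim=1, keepdim=True)       # (N, 1, H, W)
        mx = x.max(dim=1, keepdim=True).values  # (N, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                         # reweight each location
```

Combining this with ECA (which attends over channels) would give attention along both the channel and spatial axes.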
-
See project proposal [here](https://andre-martins.github.io/pages/project-examples-for-deep-structured-learning-fall-2019.html).
-
As far as I know, image captions can be generated without an attention mechanism.
How can I remove the attention mechanism effectively for training?
Could you give me some advice, please?
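One common attention-free alternative is to collapse the encoder's feature grid once with mean pooling and condition every decoder step on that single fixed vector. A minimal sketch (all names and shapes here are illustrative, not from any particular codebase):

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, d = 7, 7, 16
feats = rng.standard_normal((H * W, d))  # CNN feature grid, flattened

# No attention: one fixed context vector instead of per-step weighted sums.
context = feats.mean(axis=0)

def decoder_step(hidden, context):
    # Condition the step on the same pooled context every time.
    return np.tanh(np.concatenate([hidden, context]))

hidden = np.zeros(d)
out = decoder_step(hidden, context)
```

Training then proceeds as usual; only the per-step attention computation over the `H * W` locations is removed.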