qsh-zh closed this issue 2 years ago
Thank you for your interest. RPB stands for relative positional biases: learnable biases that are added to the attention weights. They're widely used, especially in restricted attention patterns such as Swin's WSA, and we adopted them into NA.
You can refer to these works for more information. https://arxiv.org/abs/1711.11575 https://arxiv.org/abs/1904.11491 https://arxiv.org/abs/1910.10683 https://arxiv.org/abs/2002.12804
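To make the idea concrete, here is a minimal pure-Python sketch of relative positional biases in plain 1D single-head attention. It is not the NA implementation: the function name `attention_with_rpb`, the list-based layout, and the `2n - 1` bias table indexed by the offset `j - i` are all assumptions made for illustration. It also assumes the biases are added to the scaled query-key scores before the softmax, as in the works linked above.

```python
import math


def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_rpb(q, k, rpb):
    """Attention weights with a relative positional bias (illustrative only).

    q, k: lists of n vectors (lists of floats), each of dimension d.
    rpb:  list of 2n - 1 biases (learned in a real model); the entry
          rpb[(j - i) + (n - 1)] is the bias for key j relative to query i.
    """
    n = len(q)
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for i in range(n):
        scores = []
        for j in range(n):
            dot = sum(a * b for a, b in zip(q[i], k[j]))
            # The bias depends only on the relative offset j - i,
            # not on the absolute positions i and j.
            scores.append(dot * scale + rpb[(j - i) + (n - 1)])
        weights.append(softmax(scores))
    return weights


# Usage: with all-zero queries and keys, the bias alone shapes the weights.
q = [[0.0, 0.0] for _ in range(3)]
k = [[0.0, 0.0] for _ in range(3)]
rpb = [0.0, 0.0, 2.0, 0.0, 0.0]  # favours offset 0 (each token attends to itself)
weights = attention_with_rpb(q, k, rpb)
```

Because the table is indexed by offset rather than by absolute position, the same bias is shared by every query-key pair that has the same relative displacement. In a 2D windowed or neighborhood setting the table would be indexed by a pair of offsets instead.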
@alihassanijr thanks for your timely response.
Thanks for your awesome work!
Can you provide some clues about what rpb and apply_pb are? They do not appear in standard attention.
Thanks