SHI-Labs / Neighborhood-Attention-Transformer

Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
MIT License

abbreviation for rpb #58

Closed qsh-zh closed 2 years ago

qsh-zh commented 2 years ago

Thanks for your awesome work.

Can you provide some clues about what rpb and apply_pb are? They do not appear in standard attention.

Thanks

alihassanijr commented 2 years ago

Thank you for your interest. RPB stands for relative positional biases: learnable biases that are added to the attention weights. They're widely used, especially in restricted attention patterns such as Swin's WSA, and we adopted them in NA.
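For intuition, here is a minimal, hypothetical sketch of the idea (not this repo's implementation): a 1-D windowed attention with kernel size `k`, where a learnable bias indexed by each neighbor's relative offset is added to the attention weights before the softmax. The real NA module uses a 2-D table of shape `(heads, 2k-1, 2k-1)` and handles edges; both are ignored here for brevity.

```python
# Minimal 1-D sketch of relative positional bias (RPB), assuming a
# centered neighborhood of size k. All names here are illustrative.
import torch

num_heads, k = 4, 7
# One learnable bias per head and per relative offset in [-(k-1), k-1].
rpb = torch.nn.Parameter(torch.zeros(num_heads, 2 * k - 1))

# attn: raw attention weights of each query over its k neighbors,
# shape (batch, heads, seq_len, k); random values stand in for Q @ K^T.
batch, seq_len = 2, 16
attn = torch.randn(batch, num_heads, seq_len, k)

# The j-th neighbor of a query sits at relative offset j - (k-1)//2,
# which indexes the bias table at j - (k-1)//2 + (k-1) = j + (k-1)//2.
idx = torch.arange(k) + (k - 1) // 2   # shape (k,)
bias = rpb[:, idx]                     # shape (heads, k)

# Broadcast-add the bias before the softmax (the "apply_pb" step).
attn = attn + bias[None, :, None, :]
attn = attn.softmax(dim=-1)
print(attn.shape)  # torch.Size([2, 4, 16, 7])
```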

You can refer to these works for more information:
- https://arxiv.org/abs/1711.11575
- https://arxiv.org/abs/1904.11491
- https://arxiv.org/abs/1910.10683
- https://arxiv.org/abs/2002.12804

qsh-zh commented 2 years ago

@alihassanijr Thanks for your timely response.

Thanks for your awesome work!