SHI-Labs / Neighborhood-Attention-Transformer

Neighborhood Attention Transformer (arXiv 2022 / CVPR 2023). Dilated Neighborhood Attention Transformer (arXiv 2022).
MIT License

Alternatives to Relative Positional Biases? #104

Open nikhilmishra000 opened 4 months ago

nikhilmishra000 commented 4 months ago

I see in the README it says:

There are simply better alternatives that don't involve explicitly biasing the attention weight matrix, and they will be more performant while providing similar or better accuracy.

What alternatives do you recommend? It looks like the NAT repo itself still uses relative positional biases.
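For context on what is being compared: NAT (like Swin) adds a learned bias, indexed by relative position, directly to the attention logits, while the README hints at alternatives that avoid touching the logits at all. One commonly cited bias-free alternative is rotary position embeddings (RoPE), which rotate the query/key channels by position so relative position enters the dot product implicitly. The thread does not confirm which alternative the authors have in mind, so the sketch below is illustrative only, shown in 1D with NumPy; all function and variable names are hypothetical.

```python
import numpy as np

def attention_with_relative_bias(q, k, bias_table):
    """Swin/NAT-style logits: a learned bias, looked up by the relative
    position (i - j) of query i and key j, is added to q @ k^T.
    q, k: (seq, dim); bias_table: (2*seq - 1,) learned parameters."""
    seq, dim = q.shape
    logits = q @ k.T / np.sqrt(dim)
    # relative offset of each key w.r.t. each query, shifted non-negative
    rel = np.arange(seq)[None, :] - np.arange(seq)[:, None] + seq - 1
    return logits + bias_table[rel]

def rope(x, base=10000.0):
    """Rotary position embedding: rotate channel pairs by an angle
    proportional to token position. Rotated q/k dot products then depend
    only on relative offsets -- no explicit bias term on the logits."""
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)          # (half,)
    angles = np.arange(seq)[:, None] * freqs[None, :]  # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

The practical difference: the bias table is an extra learned parameter and an extra add on every attention map, whereas RoPE is a fixed elementwise transform applied once to queries and keys before the matmul, which is part of why bias-free schemes tend to be cheaper.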

AdityaKane2001 commented 4 months ago

Please refer to SHI-Labs/Neighborhood-Attention-Transformer#105.

alihassanijr commented 2 months ago

I moved this issue here since it's more related to NAT/DiNAT than to NATTEN.

We'll be updating this thread soon.