SHI-Labs / Neighborhood-Attention-Transformer

Neighborhood Attention Transformer (arXiv 2022 / CVPR 2023). Dilated Neighborhood Attention Transformer (arXiv 2022).
MIT License

Is there an efficient way to apply NAtten to 1D data? #30

Closed · Guaishou74851 closed 2 years ago

Guaishou74851 commented 2 years ago

Hi, your work is really inspiring and interesting! I am wondering whether there is a simple way to apply your NAtten block to 1D data with a shape of (B, C, L), where L is the sequence length.

Guaishou74851 commented 2 years ago

I made a copy of the NAtten folder, but I have no idea how to adjust the source code to make the block act like Conv1d, with a 1D kernel (e.g., a 49×1 kernel).

alihassanijr commented 2 years ago

Hello and thank you for your interest,

We plan to release a complete torch extension in the near future with support for both single- and multi-dimensional data, as well as other related features. The 1D case is actually relatively easy to do, so we will release a version of it in the coming days.

I'll keep this issue open until then, so you get notified when we merge the PR.

alihassanijr commented 2 years ago

Actually, I just went ahead and did it, considering how simple it was. You can now import NeighborhoodAttention1d from natten, which works on 4D tensors (Batch, Head, Length, Dim) instead of 5D.
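
For reference, a minimal usage sketch for the setup from the original question. This assumes the module itself takes channel-last (B, L, C) input like the 2D module, with the (Batch, Head, Length, Dim) layout handled internally, and that the constructor accepts dim, kernel_size, and num_heads; check the actual signature in natten before relying on this.

```python
import torch
from natten import NeighborhoodAttention1d

# Hypothetical settings; actual constructor arguments may differ.
dim, num_heads, kernel_size = 64, 4, 49  # kernel_size 49, as in the question above

na1d = NeighborhoodAttention1d(dim=dim, kernel_size=kernel_size, num_heads=num_heads)

x = torch.randn(2, dim, 128)   # (B, C, L), the shape from the original question
x = x.permute(0, 2, 1)         # -> (B, L, C): assumed channel-last input
x = na1d(x)                    # neighborhood attention over the length dimension
x = x.permute(0, 2, 1)         # -> back to (B, C, L)
```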

I'll keep this open in case you have any issues, but feel free to close it if it works out.

Guaishou74851 commented 2 years ago

It works well in my program. Thanks for the timely responses and all your effort!