SHI-Labs / Neighborhood-Attention-Transformer

Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
MIT License

Questions about the algorithm speed. #13

Closed: jikerWRN closed this issue 2 years ago

jikerWRN commented 2 years ago

Hi, thanks for your good work. I noticed that the paper does not compare algorithm speed. I would like to know how NAT compares in speed to Swin Transformer and CNN models. Thanks!

alihassanijr commented 2 years ago

Hello, and thank you for your interest. We did not discuss speed in this version of the paper because the implementation (the CUDA kernel) is still a work in progress. Since both training and inference speeds depend largely on the implementation, a comparison would not be meaningful at this stage.

That said, with the currently released version of the kernel, we found that NAT-Mini runs ImageNet classification inference faster than Swin-Tiny, and NAT-Tiny is about as fast as Swin-Tiny. NAT-Small and NAT-Base, however, are slower than their Swin counterparts, due to inefficiencies still present in our kernel, which we are working on. Note that all of these numbers depend on input size: with bigger models (more channels) or larger inputs (higher resolution), the comparison will change.
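For context, here is a rough sketch of how such an inference-throughput comparison might be measured in PyTorch. This is not the benchmarking code used for the numbers above; the model names in the usage comment are placeholders, not exact entry points in this repo.

```python
import time
import torch


@torch.no_grad()
def throughput(model, batch_size=64, resolution=224, iters=50, warmup=10):
    """Return images/second for a classification model on a CUDA device."""
    model = model.cuda().eval()
    x = torch.randn(batch_size, 3, resolution, resolution, device="cuda")
    # Warm-up passes so kernel launches / autotuning do not skew the timing.
    for _ in range(warmup):
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    elapsed = time.time() - start
    return batch_size * iters / elapsed


# Example usage (model names are illustrative placeholders):
# print(throughput(nat_mini), throughput(swin_tiny))
```

Comparing models under identical batch size and resolution is important, since, as noted above, the relative speeds shift with the number of channels and the input resolution.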

I hope this answers your question, but please let me know if you need further clarification.

jikerWRN commented 2 years ago

Ok, thanks for your reply!