Closed Rudeguy1 closed 1 year ago
Thanks to the author for sharing this work; it has been very helpful to me.

I have a small question after studying it: is your attention module added after each block? Is there a theoretical difference between this and adding it only at the last level?

---

Hello @Rudeguy1, thanks for taking a look at our work and for your appreciation. We followed the same setting as ECANet and SENet, so we add TA after each block. There is no intrinsic harm in adding it only at the last level, but it seems fairly clear that adding it at every level should provide stronger results at the cost of additional compute overhead; this can be easily validated experimentally.
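The two placements discussed above can be sketched as follows. This is a toy illustration, not the repository's actual code: `attention` is a stand-in for the TA module and `block` for a backbone block, both simplified to plain functions.

```python
# Toy sketch of attention placement. "attention" stands in for a TA
# module; "block" stands in for a backbone conv block. Both are
# hypothetical placeholders chosen for illustration only.

def attention(x):
    # Placeholder for the attention module: recalibrates features
    # (here reduced to an identity-like scaling).
    return [v * 1.0 for v in x]

def block(x):
    # Placeholder for a backbone block (here a toy transformation).
    return [v + 1 for v in x]

def backbone_every_block(x, n_blocks=3):
    # ECANet/SENet-style placement: attention after each block.
    for _ in range(n_blocks):
        x = attention(block(x))
    return x

def backbone_last_only(x, n_blocks=3):
    # Alternative placement: attention only after the final block.
    for _ in range(n_blocks):
        x = block(x)
    return attention(x)
```

With a real module, `backbone_every_block` recalibrates features at every stage at the cost of extra compute per block, while `backbone_last_only` applies the recalibration once on the final feature map.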