Closed Rudeguy1 closed 1 year ago
Thanks to the author for sharing this work; it has been very helpful to me.

I have a small question after studying it: is your attention module added after each block? Is there a theoretical difference between this and adding it only at the last level?

---

Hello @Rudeguy1, thanks for taking a look at our work and for your appreciation. We followed the same setting as ECANet and SENet, so we add TA after each block. There is no intrinsic harm in adding it only at the last level, but it seems fairly clear that adding it at every level should provide stronger results at the cost of additional compute overhead; this can be easily validated experimentally.
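The two placements discussed above can be sketched as follows. This is a toy illustration, not the repository's actual code: `attention` is a stand-in for the TA module and `block` for a backbone block, both simplified to plain functions.

```python
# Toy sketch of attention placement. "attention" stands in for a TA
# module; "block" stands in for a backbone conv block. Both are
# hypothetical placeholders chosen for illustration only.

def attention(x):
    # Placeholder for the attention module: recalibrates features
    # (here reduced to an identity-like scaling).
    return [v * 1.0 for v in x]

def block(x):
    # Placeholder for a backbone block (here a toy transformation).
    return [v + 1 for v in x]

def backbone_every_block(x, n_blocks=3):
    # ECANet/SENet-style placement: attention after each block.
    for _ in range(n_blocks):
        x = attention(block(x))
    return x

def backbone_last_only(x, n_blocks=3):
    # Alternative placement: attention only after the final block.
    for _ in range(n_blocks):
        x = block(x)
    return attention(x)
```

With a real module, `backbone_every_block` recalibrates features at every stage at the cost of extra compute per block, while `backbone_last_only` applies the recalibration once on the final feature map.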