It seems like the RGA module is different from an ordinary attention module, such as the one used in ABD-Net.
According to your article, the main advantage of RGA is that it can extract global relation information at a lower computational cost. However, the spatial and channel attention modules in ABD-Net seem to be more efficient, since they add no extra computational cost for feature embedding.
Could you tell me what the advantage of RGA is over an ordinary attention module?
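To make the comparison concrete, here is a minimal sketch of the two flavours of spatial attention I have in mind. It is only my own rough understanding, not the actual ABD-Net or RGA code: the class names, reduction ratio, and shapes are illustrative assumptions. The first module predicts a per-position weight directly from the feature map; the second first builds the pairwise relation matrix and then embeds it, which is where the extra feature-embedding cost I mentioned comes from.

```python
import torch
import torch.nn as nn


class SimpleSpatialAttention(nn.Module):
    """Sketch of an 'ordinary' spatial attention: a small conv head predicts
    one weight per spatial position directly from the feature map, so no
    pairwise relation matrix is built. (Illustrative only, not ABD-Net's module.)"""

    def __init__(self, in_channels, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, in_channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels // reduction, 1, kernel_size=1),
        )

    def forward(self, x):                        # x: (B, C, H, W)
        attn = torch.sigmoid(self.conv(x))       # (B, 1, H, W)
        return x * attn


class RelationAwareSpatialAttention(nn.Module):
    """Sketch of a relation-aware spatial attention in the spirit of RGA-S:
    pairwise relations between all positions are computed and then embedded
    (the extra 'feature embedding' step) before producing the per-position
    weight. (Illustrative only, not the authors' implementation.)"""

    def __init__(self, in_channels, spatial_size, reduction=8):
        super().__init__()
        n = spatial_size  # number of positions, must equal H * W at runtime
        self.embed_q = nn.Conv2d(in_channels, in_channels // reduction, 1)
        self.embed_k = nn.Conv2d(in_channels, in_channels // reduction, 1)
        # each position gets a relation vector of length 2*n (in- and out-relations)
        self.embed_relation = nn.Sequential(
            nn.Conv2d(2 * n, n // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(n // reduction, 1, 1),
        )

    def forward(self, x):                              # x: (B, C, H, W)
        b, c, h, w = x.shape
        n = h * w
        q = self.embed_q(x).reshape(b, -1, n)          # (B, C', N)
        k = self.embed_k(x).reshape(b, -1, n)          # (B, C', N)
        relation = torch.bmm(q.transpose(1, 2), k)     # (B, N, N) pairwise affinities
        # stack "how every j relates to i" and "how i relates to every j" per position i
        rel_feat = torch.cat([relation, relation.transpose(1, 2)], dim=1)  # (B, 2N, N)
        rel_feat = rel_feat.reshape(b, 2 * n, h, w)    # treat relation vectors as channels
        attn = torch.sigmoid(self.embed_relation(rel_feat))  # (B, 1, H, W)
        return x * attn


# quick shape check
x = torch.randn(2, 256, 8, 8)
plain = SimpleSpatialAttention(256)
rga_like = RelationAwareSpatialAttention(256, spatial_size=8 * 8)
print(plain(x).shape, rga_like(x).shape)  # both: torch.Size([2, 256, 8, 8])
```

If my sketch of the second module is roughly right, it is the N x N relation matrix and the extra embedding convolutions on top of it that I was referring to as the added cost.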