This repo is the official implementation of 'Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation' which is an improved journal version of UCTransNet.
Excuse me, why does the `SSA` module use multi-head attention while the `CFA` module uses single-head attention? #17
Hi, the use of single-head attention in CFA is to reduce computational cost, since we found no major difference compared with multi-head attention. The underlying reason may be due to the channel-wise characteristic of the attention. Hope this helps.
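To make the channel-wise point concrete, below is a minimal sketch of single-head channel-wise attention. It is not the repo's actual CFA implementation; the class name, input shape `(B, N, C)`, and projection layout are assumptions for illustration. The key idea is that attention is taken over a `(C x C)` similarity map, so a single head already mixes all channels.

```python
import torch
import torch.nn as nn

class SingleHeadChannelAttention(nn.Module):
    """Hypothetical sketch of single-head channel-wise attention.

    Assumes inputs are token sequences of shape (B, N, C); attention is
    computed between channels rather than spatial positions, so the
    similarity map is (C x C) and one head already covers all channel
    interactions.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Linear(channels, channels, bias=False)
        self.key = nn.Linear(channels, channels, bias=False)
        self.value = nn.Linear(channels, channels, bias=False)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) -> project, then transpose so channels act as tokens
        q = self.query(x).transpose(1, 2)   # (B, C, N)
        k = self.key(x).transpose(1, 2)     # (B, C, N)
        v = self.value(x).transpose(1, 2)   # (B, C, N)

        # Channel-to-channel similarity map of shape (B, C, C)
        attn = self.softmax(q @ k.transpose(1, 2) / (q.size(-1) ** 0.5))
        out = attn @ v                      # (B, C, N)
        return out.transpose(1, 2)          # back to (B, N, C)


if __name__ == "__main__":
    x = torch.randn(2, 196, 64)             # e.g. 14x14 token map with 64 channels
    print(SingleHeadChannelAttention(64)(x).shape)  # torch.Size([2, 196, 64])
```

Splitting this into multiple heads would only partition the channel dimension into smaller groups, which is consistent with the observation above that multi-head attention brings no major gain for the channel-wise case while adding cost.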