Jongchan / attention-module

Official PyTorch code for "BAM: Bottleneck Attention Module (BMVC2018)" and "CBAM: Convolutional Block Attention Module (ECCV2018)"

Question: Position of BN in BAM. #49

Open Ujjawal-K-Panchal opened 1 year ago

Ujjawal-K-Panchal commented 1 year ago

I think there may be an inconsistency between the paper and the code in where Batch Normalization is applied.

1st instance of possible inconsistency:

The paper mentions on Page 4, Section 3, subsection Channel attention branch that:

[image: quoted channel attention equation, $M_{\mathbf{c}}(\mathbf{F}) = \mathrm{BN}(\mathrm{MLP}(\mathrm{AvgPool}(\mathbf{F})))$]

So it is understood that the batch norm is applied at the end of the MLP, i.e. after the final layer. However, the implementation's forward() function applies the channel gate as follows: https://github.com/Jongchan/attention-module/blob/459efad0e05ee7dde50c41ca10a3d0800bc3792a/MODELS/bam.py#L25

Where self.gate_c's final layer is defined as: https://github.com/Jongchan/attention-module/blob/459efad0e05ee7dde50c41ca10a3d0800bc3792a/MODELS/bam.py#L22
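
For reference, here is a minimal sketch of what the paper's ordering would imply, i.e. BN applied after the final FC layer of the MLP. This is not the repository's code; `ChannelGatePaper` is a hypothetical name, and the layer sizes just follow the paper's reduction-ratio convention:

```python
import torch.nn as nn
import torch.nn.functional as F

class ChannelGatePaper(nn.Module):
    """Hypothetical, paper-faithful channel branch: M_c(F) = BN(MLP(AvgPool(F)))."""
    def __init__(self, gate_channel, reduction_ratio=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(gate_channel, gate_channel // reduction_ratio),
            nn.ReLU(),
            nn.Linear(gate_channel // reduction_ratio, gate_channel),
        )
        # BN placed *after* the final FC layer, as the quoted equation suggests.
        self.bn = nn.BatchNorm1d(gate_channel)

    def forward(self, x):
        pooled = F.adaptive_avg_pool2d(x, 1).flatten(1)    # global average pool -> (N, C)
        out = self.bn(self.mlp(pooled))                    # BN applied last
        return out.unsqueeze(2).unsqueeze(3).expand_as(x)  # broadcast to (N, C, H, W)
```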

Ujjawal-K-Panchal commented 1 year ago

2nd instance of possible inconsistency:

The paper mentions on Page 5, Section 3, subsection Spatial attention branch that:

[image: quoted spatial attention equation, $M_{\mathbf{s}}(\mathbf{F}) = \mathrm{BN}(f_3^{1\times1}(f_2^{3\times3}(f_1^{3\times3}(f_0^{1\times1}(\mathbf{F})))))$]

So it is understood that the batch norm is applied at the end of the spatial attention branch, i.e. after the final convolution layer. However, the implementation's forward() function applies the spatial gate as follows:

https://github.com/Jongchan/attention-module/blob/459efad0e05ee7dde50c41ca10a3d0800bc3792a/MODELS/bam.py#L41

Where self.gate_s's final layer is defined as:

https://github.com/Jongchan/attention-module/blob/459efad0e05ee7dde50c41ca10a3d0800bc3792a/MODELS/bam.py#L39
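
Again for reference, a minimal sketch of what the paper's ordering would imply for this branch, i.e. BN after the final 1×1 convolution. `SpatialGatePaper` is a hypothetical name; the dilated 3×3 convolutions follow the paper's description:

```python
import torch.nn as nn

class SpatialGatePaper(nn.Module):
    """Hypothetical, paper-faithful spatial branch: M_s(F) = BN(f_1x1(f_3x3(f_3x3(f_1x1(F)))))."""
    def __init__(self, gate_channel, reduction_ratio=16, dilation=4):
        super().__init__()
        mid = gate_channel // reduction_ratio
        self.body = nn.Sequential(
            nn.Conv2d(gate_channel, mid, kernel_size=1),                              # channel reduction
            nn.ReLU(),
            nn.Conv2d(mid, mid, kernel_size=3, padding=dilation, dilation=dilation),  # dilated 3x3
            nn.ReLU(),
            nn.Conv2d(mid, mid, kernel_size=3, padding=dilation, dilation=dilation),  # dilated 3x3
            nn.ReLU(),
            nn.Conv2d(mid, 1, kernel_size=1),                                         # single-channel map
        )
        # BN placed *after* the final 1x1 convolution, as the quoted equation suggests.
        self.bn = nn.BatchNorm2d(1)

    def forward(self, x):
        return self.bn(self.body(x)).expand_as(x)  # broadcast (N, 1, H, W) to (N, C, H, W)
```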

Ujjawal-K-Panchal commented 1 year ago

One might suspect that the final BN layers are instead added in the BAM() class. However, looking at the BAM() class:

https://github.com/Jongchan/attention-module/blob/459efad0e05ee7dde50c41ca10a3d0800bc3792a/MODELS/bam.py#L48

it becomes clear that no BN layer is applied before $M_c$ and $M_s$ are combined. Is this an inconsistency?
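
To make the question concrete, here is a sketch of the combination step using the paper's formulation, $M(\mathbf{F}) = \sigma(M_c(\mathbf{F}) + M_s(\mathbf{F}))$ and $\mathbf{F}' = \mathbf{F} + \mathbf{F} \otimes M(\mathbf{F})$, built on the two hypothetical branch classes sketched above. It is only an illustration (the linked BAM() class may combine the branches slightly differently), but it shows that no BN appears at this stage, so the paper's final BN would have to live inside each branch:

```python
import torch
import torch.nn as nn

class BAMPaper(nn.Module):
    """Hypothetical combination following the paper; uses the sketches above."""
    def __init__(self, gate_channel):
        super().__init__()
        self.channel_att = ChannelGatePaper(gate_channel)  # hypothetical class from the first sketch
        self.spatial_att = SpatialGatePaper(gate_channel)  # hypothetical class from the second sketch

    def forward(self, x):
        # No BN here: the branch outputs are summed, squashed with a sigmoid,
        # and used as a residual attention map on the input.
        att = torch.sigmoid(self.channel_att(x) + self.spatial_att(x))
        return x + x * att
```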