Ujjawal-K-Panchal opened this issue 1 year ago
I think there is possibly an inconsistency between the paper and the code in where the Batch Normalization is applied.

1st instance of possible inconsistency:

The paper mentions on Page 4, Section 3, subsection Channel attention branch, that the batch norm is applied at the end of the MLP, i.e. after the final linear layer. However, in the implementation, the final step of the channel branch in the `forward()` function is:

https://github.com/Jongchan/attention-module/blob/459efad0e05ee7dde50c41ca10a3d0800bc3792a/MODELS/bam.py#L25

where `self.gate_c`'s final layer is defined as:

https://github.com/Jongchan/attention-module/blob/459efad0e05ee7dde50c41ca10a3d0800bc3792a/MODELS/bam.py#L22

so the branch ends with a plain `Linear` layer, with no BatchNorm applied after it.
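For concreteness, below is a minimal PyTorch sketch (not the repository's code) of a channel attention branch with the batch norm placed where the paper's wording suggests, i.e. after the MLP's final linear layer. The class name, hidden sizes, and reduction ratio are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChannelGatePaperStyle(nn.Module):
    """Channel attention with BN at the end of the branch, i.e. after the
    final Linear layer, as the paper's description implies. Layer sizes and
    the reduction ratio are assumptions for illustration only."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.bn = nn.BatchNorm1d(channels)  # BN after the final Linear

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        pooled = x.mean(dim=(2, 3))       # global average pooling -> (B, C)
        gate = self.bn(self.mlp(pooled))  # BN applied at the end of the MLP
        return gate.view(b, c, 1, 1).expand_as(x)
```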
2nd instance of possible inconsistency:

The paper mentions on Page 5, Section 3, subsection Spatial attention branch, that the batch norm is applied at the end of the spatial attention branch, i.e. after the branch's final layer. However, in the implementation, the `forward()` function applies `self.gate_s` as the last step of the branch, and the final layer defined in `self.gate_s` is likewise not followed by a BatchNorm.
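Similarly, here is a minimal sketch (again an illustration under assumed hyperparameters, not the repository's code) of a spatial attention branch with the batch norm after the final 1x1 convolution, matching where the paper places it. The channel reduction and dilation values are assumptions.

```python
import torch
import torch.nn as nn

class SpatialGatePaperStyle(nn.Module):
    """Spatial attention with BN at the end of the branch, i.e. after the
    final 1x1 convolution. Channel reduction and dilation values are
    illustrative assumptions."""
    def __init__(self, channels: int, reduction: int = 16, dilation: int = 4):
        super().__init__()
        mid = channels // reduction
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, kernel_size=3, padding=dilation, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, 1, kernel_size=1),  # final 1x1 conv -> 1 channel
        )
        self.bn = nn.BatchNorm2d(1)  # BN after the final convolution

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output is (B, 1, H, W), broadcast over channels to match x.
        return self.bn(self.body(x)).expand_as(x)
```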
One might suspect that the final BN layers could instead be added in the `BAM()` class, after each branch. However, looking at the `BAM()` class in the same file, it becomes clear that no BN layer is applied before the combination of $M_c$ and $M_s$ either. Is this an inconsistency?
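To make the combination step concrete, here is a hedged sketch of how the two branch outputs are merged into the attention map, following the paper's formulation $M(F) = \sigma(M_c(F) + M_s(F))$ and the refinement $F' = F \otimes (1 + M(F))$. It reuses the two hypothetical classes from the sketches above. The point is that only an element-wise sum and a sigmoid happen at this stage, so a BN omitted inside the branches is not recovered here.

```python
import torch
import torch.nn as nn

class BAMSketch(nn.Module):
    """Combines the two hypothetical branch modules defined above.
    Only a sum and a sigmoid are applied here, so any BatchNorm that the
    branches themselves omit is not added back at this stage."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel_gate = ChannelGatePaperStyle(channels)
        self.spatial_gate = SpatialGatePaperStyle(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # M(F) = sigmoid(M_c(F) + M_s(F)); both gates broadcast to x's shape.
        attention = torch.sigmoid(self.channel_gate(x) + self.spatial_gate(x))
        return x * (1 + attention)  # F' = F * (1 + M(F))


# Example usage with assumed sizes:
# out = BAMSketch(64)(torch.randn(2, 64, 32, 32))
```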