Difference on SE from paper

xiaolai-sqlai / mobilenetv3

MobileNetV3 in PyTorch, with pre-trained models provided

MIT License


Difference on SE from paper #15

Open triangleCZH opened 5 years ago

triangleCZH commented 5 years ago

1) I am just curious: why do you add batch normalization inside SeModule? Is there a reference for doing so?
2) Please correct me if I am mistaken: I think SeModule should be placed between dw and pw-linear, but your code seems to add it after pw-linear, right before the residual connection.
3) Do you think it's necessary to handle expand_ratio = 1? When expand_channel == output_channel, I feel that pw might be redundant, since the shape won't change at all after pw.

Thank you!

ujsyehao commented 4 years ago

1) In EfficientNet, there is no BN in SE.
2) SE should be added between the 3x3 dw and the 1x1 pw.
3) Yes, it is redundant. You can refer to MobileNetV2.
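
For illustration, the three points in this thread could be sketched in PyTorch roughly as follows. This is a minimal sketch, not the repository's code: the class names `SqueezeExcite` and `Bottleneck`, the reduction factor of 4, and the use of `nn.Hardswish` / `nn.Hardsigmoid` everywhere are assumptions made for the example. It shows an SE gate with no BN, SE placed between the depthwise conv and the linear pointwise conv, and the expansion pointwise conv skipped when it would not change the channel count.

```python
import torch
from torch import nn


class SqueezeExcite(nn.Module):
    """SE gate without batch normalization (hypothetical name, not the repo's SeModule)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        squeezed = max(1, channels // reduction)  # assumed reduction factor
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze: global average pool
            nn.Conv2d(channels, squeezed, kernel_size=1),  # reduce
            nn.ReLU(inplace=True),
            nn.Conv2d(squeezed, channels, kernel_size=1),  # restore
            nn.Hardsigmoid(),                              # per-channel gate in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)


class Bottleneck(nn.Module):
    """Inverted residual block: [pw-expand] -> dw -> [SE] -> pw-linear (+ residual)."""

    def __init__(self, in_ch, exp_ch, out_ch, kernel_size=3, stride=1, use_se=True):
        super().__init__()
        layers = []
        # Point 3: skip the expansion pw conv when it would not change the channel count.
        if exp_ch != in_ch:
            layers += [nn.Conv2d(in_ch, exp_ch, 1, bias=False),
                       nn.BatchNorm2d(exp_ch), nn.Hardswish()]
        # Depthwise conv: groups == channels.
        layers += [nn.Conv2d(exp_ch, exp_ch, kernel_size, stride,
                             padding=kernel_size // 2, groups=exp_ch, bias=False),
                   nn.BatchNorm2d(exp_ch), nn.Hardswish()]
        # Point 2: SE sits between dw and pw-linear. Point 1: it contains no BN.
        if use_se:
            layers.append(SqueezeExcite(exp_ch))
        # Linear pointwise projection (no activation after BN).
        layers += [nn.Conv2d(exp_ch, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch)]
        self.block = nn.Sequential(*layers)
        self.use_residual = stride == 1 and in_ch == out_ch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.block(x)
        return x + out if self.use_residual else out
```

With `Bottleneck(16, 16, 16)` the first layer of the block is the depthwise conv, since no expansion is needed; with `Bottleneck(16, 64, 24, stride=2)` the expansion conv is present and the residual connection is dropped.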