xinghaochen / SLAB

[ICML 2024] Official PyTorch implementation of "SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization"

About RepBN #7

Open leily578 opened 2 months ago

leily578 commented 2 months ago

Thank you for the great work! Why do you add a shortcut to BN in RepBN? Is it, as explained in RepVGG, to build a multi-branch architecture that makes the model an implicit ensemble of many shallower models and enriches its expressive power?

guojialong1 commented 2 months ago

The shortcut is mainly there to enrich the model's expressive power and to allow BatchNorm to be skipped.
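
To make the shortcut concrete, here is a minimal sketch in plain Python. It assumes RepBN has the form BN(x) + eta * x with a learnable scalar eta, which is my reading of the SLAB paper. It is not the repository's code: the function names (`batch_norm`, `rep_bn`, `fuse_rep_bn`) and the sample values are hypothetical, and it handles a single channel in inference mode only. It shows that the two-branch form can be folded back into one BatchNorm by adjusting the affine parameters.

```python
import math

def batch_norm(x, mean, var, gamma, beta, eps=1e-5):
    """Inference-mode BatchNorm for one channel, using fixed running statistics."""
    std = math.sqrt(var + eps)
    return [gamma * (v - mean) / std + beta for v in x]

def rep_bn(x, mean, var, gamma, beta, eta, eps=1e-5):
    """BN branch plus a shortcut scaled by eta (assumed form: BN(x) + eta * x)."""
    bn_out = batch_norm(x, mean, var, gamma, beta, eps)
    return [b + eta * v for b, v in zip(bn_out, x)]

def fuse_rep_bn(mean, var, gamma, beta, eta, eps=1e-5):
    """Fold the shortcut into BN's affine parameters.

    eta * x == (eta * std) * (x - mean) / std + eta * mean,
    so the scale gains eta * std and the shift gains eta * mean.
    """
    std = math.sqrt(var + eps)
    return gamma + eta * std, beta + eta * mean

# Hypothetical sample values for one channel.
x = [0.5, -1.2, 3.0, 0.0]
mean, var, gamma, beta, eta = 0.4, 2.0, 1.5, -0.3, 0.7

two_branch = rep_bn(x, mean, var, gamma, beta, eta)
fused_gamma, fused_beta = fuse_rep_bn(mean, var, gamma, beta, eta)
single_branch = batch_norm(x, mean, var, fused_gamma, fused_beta)

# Largest element-wise gap between the two-branch and fused outputs.
max_diff = max(abs(a - b) for a, b in zip(two_branch, single_branch))
```

Under the same assumption, setting `gamma` and `beta` to zero leaves only `eta * x`, which is one way to read "BatchNorm can be skipped": the layer can lean on the shortcut instead of on normalization.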

leily578 commented 2 months ago

Thanks for your reply!