sIncerass / powernorm

[ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845
GNU General Public License v3.0

a question about the image of layer normalization in README.md #12

Closed · erjiaxiao closed this 2 years ago

erjiaxiao commented 2 years ago

The pic of LN in README.md looks like 1 in my understanding. I guess maybe LN should cover a whole layer instead of just a single line of a layer? Am I wrong somewhere?
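For reference, here is a minimal PyTorch sketch (shapes are hypothetical) of what `nn.LayerNorm` computes in a Transformer: every token, i.e. every (batch, position) slice, gets its own mean and variance over the hidden dimension, so the statistics cover one "line" of the (seq_len, hidden) grid at a time.

```python
import torch
import torch.nn as nn

# Hypothetical Transformer activations: (batch, seq_len, hidden)
x = torch.randn(4, 16, 512)

# nn.LayerNorm(512) normalizes over the last (hidden) dimension only,
# so each token gets its own mean and variance.
ln = nn.LayerNorm(512)
y = ln(x)

# Equivalent manual computation over the hidden dimension:
mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + ln.eps) * ln.weight + ln.bias

print(torch.allclose(y, y_manual, atol=1e-5))  # True
```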

liyuke65535 commented 2 years ago

Have a look at "Leveraging Batch Normalization for Vision Transformers". It explains the differences between norm layers: [figures: bn, ln]
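To make the contrast concrete, a rough sketch of the axes vanilla BN and LN reduce over on sequence data (shapes are again hypothetical; this illustrates plain BN/LN, not PowerNorm itself):

```python
import torch
import torch.nn as nn

# Hypothetical activations: (batch, seq_len, hidden)
x = torch.randn(4, 16, 512)

# LN: one mean/variance per token, computed across the hidden features.
ln_mean = x.mean(dim=-1)        # shape (4, 16)

# BN: one mean/variance per hidden feature, computed across the batch
# and all positions. BatchNorm1d expects (batch, channels, length).
bn = nn.BatchNorm1d(512)
y_bn = bn(x.transpose(1, 2)).transpose(1, 2)
bn_mean = x.mean(dim=(0, 1))    # shape (512,)
```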