S-aiueo32 opened this issue 4 years ago (status: Open)
Yes, we implemented SATRN before the official code was released. We will update SATRN in the future.
Good! I will keep this issue open until the fix lands!
This bug is still not fixed?
We are occupied with other work at the moment; I will fix it as soon as possible.
OK, thanks.
Hello,
I found that Layer Normalization is placed after the self-attention and residual connection in TransformerEncoderLayer2D:
https://github.com/Media-Smart/vedastr/blob/d61776461d8a9c469d6875abd437de64b039a39e/vedastr/models/bodies/sequences/transformer/unit/encoder.py#L60-L63
This is the common post-norm arrangement in basic Transformers; however, the official SATRN implementation places the normalization before each sublayer (pre-norm). Do you have any plans to align your code with the original, or is there a rationale for the current choice?
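For reference, a minimal sketch of the two orderings being discussed. This is not the vedastr `TransformerEncoderLayer2D` (which works on 2D feature maps); it uses plain `torch.nn` layers with illustrative names and sizes, purely to show where the LayerNorm sits in each variant.

```python
import torch
import torch.nn as nn


class PostNormEncoderLayer(nn.Module):
    """Post-norm ordering: LayerNorm applied after the residual addition."""

    def __init__(self, d_model=512, nhead=8, dim_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, dim_ff), nn.ReLU(),
                                nn.Linear(dim_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        x = self.norm1(x + self.attn(x, x, x)[0])  # attention -> residual -> norm
        x = self.norm2(x + self.ff(x))             # feed-forward -> residual -> norm
        return x


class PreNormEncoderLayer(nn.Module):
    """Pre-norm ordering: LayerNorm applied before each sublayer."""

    def __init__(self, d_model=512, nhead=8, dim_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, dim_ff), nn.ReLU(),
                                nn.Linear(dim_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        y = self.norm1(x)
        x = x + self.attn(y, y, y)[0]              # norm -> attention -> residual
        x = x + self.ff(self.norm2(x))             # norm -> feed-forward -> residual
        return x


if __name__ == "__main__":
    x = torch.randn(2, 25, 512)  # (batch, sequence length, model dimension)
    print(PostNormEncoderLayer()(x).shape)
    print(PreNormEncoderLayer()(x).shape)
```

Both layers preserve the input shape; the only difference is whether normalization happens before or after the residual connection of each sublayer.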