tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0
15.11k stars, 3.44k forks

Use of Layer Normalization #1910

Open Andre1998Shuvam opened 2 years ago

Andre1998Shuvam commented 2 years ago

Hello! I would like to know why layer normalization and residual connections are used in the Transformer architecture.
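
For context, the pattern in question is the sublayer wrapper described in the original Transformer paper, LayerNorm(x + Sublayer(x)). Below is a minimal pure-Python sketch of that wrapper for a single feature vector. The function names are hypothetical and are not tensor2tensor's API, the learned gain and bias of layer normalization are omitted, and `toy_sublayer` is only a stand-in for attention or a feed-forward block.

```python
import math


def layer_norm(x, eps=1e-6):
    """Normalize one feature vector to zero mean and (near) unit variance.

    Assumption: the learned scale and bias parameters are left out for brevity.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def residual_sublayer(x, sublayer):
    """Post-norm wrapper: layer_norm(x + sublayer(x))."""
    y = sublayer(x)
    # Residual connection: add the sublayer output back onto its input.
    summed = [a + b for a, b in zip(x, y)]
    return layer_norm(summed)


def toy_sublayer(x):
    # Hypothetical stand-in for self-attention or a feed-forward block.
    return [2.0 * v for v in x]


out = residual_sublayer([1.0, 2.0, 3.0, 4.0], toy_sublayer)
print(out)
```

The residual sum keeps a direct path from input to output, and the normalization rescales each position's features to a fixed mean and variance regardless of how large the sublayer output is.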