microsoft/torchscale
Foundation Architecture for (M)LLMs
https://aka.ms/GeneralAI
MIT License
Update new RetNet settings #69
Closed: sunyt32 closed this 10 months ago
sunyt32 commented 10 months ago
Make the following modifications:
Replace LayerNorm with RMSNorm in both Sub-GroupNorm and Pre-LayerNorm;
Remove the bias in the Linear layers;
Replace the FFN with SwiGLU, and re-allocate the parameters so that value_dim and ffn_dim are each 5/3 * dim.
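
For readers unfamiliar with the pieces named above, here is a minimal pure-Python sketch of RMSNorm, a bias-free linear map, and a SwiGLU block, plus the 5/3 sizing arithmetic. It is an illustration only, not the code in this PR: the function names (`rms_norm`, `linear`, `swiglu`, `five_thirds`, `swiglu_params`), the `eps` default, the (out_dim, in_dim) weight layout, the floor rounding in `five_thirds`, and the three-matrix form of SwiGLU are all assumptions made for the example.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned scale.

    Unlike LayerNorm, there is no mean subtraction and no bias term.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def linear(x, W):
    """Bias-free linear map. W is a nested list of shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def silu(v):
    """SiLU (Swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(x, W_gate, W_up, W_down):
    """Three-matrix SwiGLU (assumed form): W_down(silu(W_gate x) * (W_up x))."""
    gate = [silu(g) for g in linear(x, W_gate)]
    up = linear(x, W_up)
    hidden = [g * u for g, u in zip(gate, up)]
    return linear(hidden, W_down)


def five_thirds(dim):
    """Hypothetical helper: 5/3 * dim, rounded down when dim is not a multiple of 3."""
    return 5 * dim // 3


def swiglu_params(dim, ffn_dim):
    """Weight count of a bias-free three-matrix SwiGLU block."""
    return 3 * dim * ffn_dim
```

Under these assumptions, `five_thirds(768)` gives 1280, and `swiglu_params(768, 1280)` equals `5 * 768 * 768`: three bias-free matrices at 5/3 * dim cost 5 * dim^2 weights.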