Modalities / modalities

A framework for training multimodal foundation models.
MIT License

Feat: Implementation of RMSNorm #66

Closed: le1nux closed this issue 5 months ago

le1nux commented 6 months ago

RMSNorm is more compute-efficient than LayerNorm because it skips the mean-centering step and normalizes by the root mean square of the activations only, as explained in the original paper: https://openreview.net/pdf?id=SygkZ3MTJE
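
For reference, the two normalizations can be written side by side as below. This is a sketch in my own notation, not copied from the paper: $x \in \mathbb{R}^n$ is the input vector, $g$ and $b$ are the learned gain and bias, and $\epsilon$ is a small stabilizing constant that implementations typically add (an assumption here, it is not essential to the definition).

```latex
% LayerNorm: subtract the mean, divide by the standard deviation, then scale and shift
\mu = \frac{1}{n}\sum_{j=1}^{n} x_j, \qquad
\sigma^2 = \frac{1}{n}\sum_{j=1}^{n} (x_j - \mu)^2, \qquad
y_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}}\, g_i + b_i

% RMSNorm: no mean subtraction and no bias, only a rescaling by the root mean square
\operatorname{RMS}(x) = \sqrt{\frac{1}{n}\sum_{j=1}^{n} x_j^2 + \epsilon}, \qquad
y_i = \frac{x_i}{\operatorname{RMS}(x)}\, g_i
```

RMSNorm drops the computation of $\mu$ and the subtraction, which is where the savings come from. For an input whose mean is already zero, the two coincide up to the bias term.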

Because of these benefits, RMSNorm is used in place of LayerNorm in models such as Llama 2.

The Llama 2 implementation can be found here: https://github.com/facebookresearch/llama/blob/a0a4da8b497c566403941ceec47c2512ecf9dd20/llama/model.py#L34
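
To make the difference from LayerNorm concrete, here is a minimal, framework-free sketch of both operations on a single feature vector. It is not the Llama 2 code and not what this repo ships: the function names `rms_norm` and `layer_norm`, the plain-list interface, and the `eps` defaults are assumptions chosen for illustration. A real implementation would be a tensor module operating over the last dimension.

```python
import math
from typing import List, Sequence


def rms_norm(x: Sequence[float], weight: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Rescale x by the reciprocal of its root mean square, then apply a per-feature gain.

    Hypothetical helper for illustration; eps=1e-6 is an assumed default.
    """
    if len(x) == 0 or len(x) != len(weight):
        raise ValueError("x and weight must be non-empty and of equal length")
    # One pass over the features: mean of squares, no mean subtraction.
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def layer_norm(
    x: Sequence[float],
    weight: Sequence[float],
    bias: Sequence[float],
    eps: float = 1e-5,
) -> List[float]:
    """Standard LayerNorm for comparison: center, divide by std, then scale and shift."""
    if len(x) == 0 or not (len(x) == len(weight) == len(bias)):
        raise ValueError("x, weight and bias must be non-empty and of equal length")
    mean = sum(x) / len(x)
    variance = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(variance + eps)
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]


if __name__ == "__main__":
    features = [3.0, 4.0]
    gain = [1.0, 1.0]
    print(rms_norm(features, gain))
    print(layer_norm(features, gain, [0.0, 0.0]))
```

With unit gain, the RMSNorm output has a mean square of (approximately) one, while LayerNorm additionally forces a zero mean. The saving is the mean computation and the centering subtraction.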

flxst commented 5 months ago

Implemented in #67.