EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
https://www.eleuther.ai/
Apache License 2.0

Add Transformer Engine's version of RMSNorm and LayerNorm #1235

Status: Closed (closed by lintangsutawika 2 months ago)

lintangsutawika commented 5 months ago

Adds a wrapper for running Transformer Engine's versions of RMSNorm and LayerNorm.

cc: @Quentin-Anthony
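
For context, Transformer Engine exposes fused `LayerNorm` and `RMSNorm` modules under `transformer_engine.pytorch`. A minimal sketch of such a wrapper is shown below; the class names (`TERMSNorm`, `TELayerNorm`), the import guard, and the constructor arguments are illustrative assumptions, not necessarily the exact code in this PR.

```python
import torch

# Guard the import so the rest of the codebase still works without
# Transformer Engine installed (illustrative pattern, not the PR's exact code).
try:
    import transformer_engine.pytorch as te
    HAVE_TE = True
except ImportError:
    HAVE_TE = False


class TERMSNorm(torch.nn.Module):
    """Thin wrapper delegating to transformer_engine.pytorch.RMSNorm."""

    def __init__(self, dim, eps=1e-8, **kwargs):
        super().__init__()
        if not HAVE_TE:
            raise ImportError("transformer_engine is required to use TERMSNorm")
        self.norm = te.RMSNorm(dim, eps=eps, **kwargs)

    def forward(self, x):
        return self.norm(x)


class TELayerNorm(torch.nn.Module):
    """Thin wrapper delegating to transformer_engine.pytorch.LayerNorm."""

    def __init__(self, dim, eps=1e-5, **kwargs):
        super().__init__()
        if not HAVE_TE:
            raise ImportError("transformer_engine is required to use TELayerNorm")
        self.norm = te.LayerNorm(dim, eps=eps, **kwargs)

    def forward(self, x):
        return self.norm(x)
```

Keeping the wrapper interface identical to the existing norm modules (a `dim` plus `eps` constructor and a plain `forward(x)`) lets the norm be selected by config without touching the rest of the model code.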

Quentin-Anthony commented 2 months ago

Broadly looks good to me.

TODO:

Quentin-Anthony commented 2 months ago

Merged as part of https://github.com/EleutherAI/gpt-neox/pull/1269