apple / axlearn

An Extensible Deep Learning Library
Apache License 2.0
1.86k stars 259 forks

Support more types of groupnorm #772

Closed berlino closed 1 week ago

berlino commented 1 week ago

Previously, GroupNorm assumed that LayerNorm is applied over all input axes other than the batch and group axes. With this PR, we can choose to apply either RMSNorm or LayerNorm along configurable axes of the input tensor.
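For readers unfamiliar with the idea, here is a minimal JAX sketch of what a group norm with a selectable normalization type and configurable reduction axes could look like. It is not the code from this PR and not axlearn's API: the function name `group_norm`, the parameters `norm_type` and `norm_axes`, the `[batch, seq, dim]` input layout, and the default `eps` are all assumptions made for illustration, and the learnable scale and bias are omitted.

```python
import jax
import jax.numpy as jnp


def group_norm(x, num_groups, norm_type="layernorm", norm_axes=(1, 3), eps=1e-8):
    """Hypothetical group norm sketch (not axlearn's implementation).

    Assumes `x` has shape [batch, seq, dim]. The input is reshaped to
    [batch, seq, num_groups, dim // num_groups], and statistics are reduced
    over `norm_axes` of that reshaped tensor. Axis 0 (batch) and axis 2
    (group) must not be reduced.
    """
    batch, seq, dim = x.shape
    if dim % num_groups != 0:
        raise ValueError("dim must be divisible by num_groups")
    norm_axes = tuple(norm_axes)
    if 0 in norm_axes or 2 in norm_axes:
        raise ValueError("norm_axes must exclude the batch and group axes")

    grouped = x.reshape(batch, seq, num_groups, dim // num_groups)

    if norm_type == "layernorm":
        # Subtract the mean, then divide by the standard deviation.
        mean = grouped.mean(axis=norm_axes, keepdims=True)
        centered = grouped - mean
        var = (centered**2).mean(axis=norm_axes, keepdims=True)
        normed = centered * jax.lax.rsqrt(var + eps)
    elif norm_type == "rmsnorm":
        # Divide by the root mean square only; no mean subtraction.
        mean_square = (grouped**2).mean(axis=norm_axes, keepdims=True)
        normed = grouped * jax.lax.rsqrt(mean_square + eps)
    else:
        raise ValueError(f"unknown norm_type: {norm_type}")

    return normed.reshape(x.shape)


x = jax.random.normal(jax.random.PRNGKey(0), (2, 4, 8))
y_ln = group_norm(x, num_groups=2)  # LayerNorm over seq and per-group dim
y_rms = group_norm(x, num_groups=2, norm_type="rmsnorm", norm_axes=(3,))
```

In this sketch, the default `norm_axes=(1, 3)` corresponds to normalizing over every axis except batch and group, while `norm_axes=(3,)` restricts the statistics to the per-group feature axis so that each position is normalized independently.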

berlino commented 1 week ago

@ruomingp do you know why the checks are still not finished yet?

ruomingp commented 1 week ago

> @ruomingp do you know why the checks are still not finished yet?

I'm not sure. Maybe @markblee knows?