meta-llama / llama

Inference code for Llama models

Function does not implement RMSNorm #1193

Closed. JW-swansea closed this issue 3 weeks ago.

JW-swansea commented 3 weeks ago

Hi, I was looking through the code and noticed something strange.

This function is supposed to implement RMSNorm, from Zhang, Biao, and Rico Sennrich. "Root mean square layer normalization." Advances in Neural Information Processing Systems 32 (2019).

But instead of dividing by the appropriate coefficient, it multiplies.

https://github.com/meta-llama/llama/blob/8fac8befd776bc03242fe7bc2236cdb41b6c609c/llama/model.py#L52-L63

If the sum of the squared entries of the vector is already n, this makes no difference, but for any other value it makes larger vectors larger and smaller vectors smaller, pushing them away from that value, which is the opposite of the intended behavior.
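
For reference, here is a sketch of the RMSNorm definition from the cited paper. It leaves out the small epsilon that implementations usually add for numerical stability; $n$ is the vector length and $g_i$ is the learnable gain.

```latex
\mathrm{RMS}(a) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} a_i^2},
\qquad
\bar{a}_i = \frac{a_i}{\mathrm{RMS}(a)}\, g_i
```

When $\sum_{i} a_i^2 = n$, $\mathrm{RMS}(a) = 1$, so multiplying and dividing by it give the same result. That is the special case in which the suspected bug would go unnoticed.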

JW-swansea commented 3 weeks ago

Issue also reposted here in the current repository, close as appropriate.

JW-swansea commented 3 weeks ago

Problem solved: I misread rsqrt as sqrt.
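
To illustrate why the misreading matters, here is a minimal sketch in plain Python rather than the repository's own code. It assumes rsqrt means the reciprocal square root, 1/sqrt(v), so multiplying by it is the same as dividing by the RMS. The function names and the eps value are hypothetical, and the learnable gain is omitted for brevity.

```python
import math


def rsqrt(value):
    """Reciprocal square root: 1 / sqrt(value)."""
    return 1.0 / math.sqrt(value)


def rms_norm_divide(x, eps=1e-6):
    """Textbook form: divide each entry by sqrt(mean(x^2) + eps)."""
    mean_sq = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_sq + eps) for v in x]


def rms_norm_rsqrt(x, eps=1e-6):
    """Equivalent form: multiply each entry by rsqrt(mean(x^2) + eps)."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = rsqrt(mean_sq + eps)
    return [v * scale for v in x]


def rms(x):
    """Root mean square of a list of floats."""
    return math.sqrt(sum(v * v for v in x) / len(x))


x = [3.0, 4.0, 12.0]
divided = rms_norm_divide(x)
multiplied = rms_norm_rsqrt(x)

# Both forms agree up to floating-point rounding.
print(all(math.isclose(a, b, rel_tol=1e-12) for a, b in zip(divided, multiplied)))
# The normalized vector has an RMS very close to 1 (slightly below, because of eps).
print(rms(multiplied))
```

Multiplying by sqrt instead would indeed push vectors away from unit RMS, as the original report describes, which is why reading rsqrt as sqrt made the code look wrong.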