kyegomez / zeta

Build high-performance AI models with modular building blocks
https://zeta.apac.ai
Apache License 2.0

[BUG] RMSNorm Implementation #180

Closed 0seba closed 5 months ago

0seba commented 7 months ago

I think you should be dividing by the scale in the following line

https://github.com/kyegomez/zeta/blob/7dbb6a62f83413977a922d5fc6dec1b11f734bc3/zeta/nn/modules/rms_norm.py#L35

This is the scale definition:

https://github.com/kyegomez/zeta/blob/7dbb6a62f83413977a922d5fc6dec1b11f734bc3/zeta/nn/modules/rms_norm.py#L29C9-L29C31

self.scale = dim**-0.5

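And the RMSNorm formula:

$$\bar{a}_i = \frac{a_i}{\mathrm{RMS}(\mathbf{a})}\, g_i, \qquad \mathrm{RMS}(\mathbf{a}) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} a_i^2}$$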
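
To make the scale argument concrete, here is a minimal, dependency-free sketch. It assumes the line in question first L2-normalizes the input (what `F.normalize` does for a single vector, ignoring its small `eps` clamp) and then applies `scale = dim**-0.5`; the helper names `l2_normalize` and `rms_normalize` are hypothetical and not part of zeta. Since RMS(x) = ||x|| / sqrt(dim), dividing the L2-normalized vector by `scale` should reproduce x / RMS(x), while multiplying should be off by a factor of `dim`.

```python
import math


def l2_normalize(x):
    """Return x / ||x||_2 for a flat list of floats (hypothetical helper)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def rms_normalize(x):
    """Return x / RMS(x), where RMS(x) = sqrt(mean(x_i^2)) (hypothetical helper)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return [v / rms for v in x]


x = [1.0, -2.0, 3.0, 0.5]
dim = len(x)
scale = dim ** -0.5  # same definition as in the issue

reference = rms_normalize(x)
divided = [v / scale for v in l2_normalize(x)]     # proposed fix
multiplied = [v * scale for v in l2_normalize(x)]  # multiplying by the scale instead

matches_when_divided = all(
    math.isclose(a, b) for a, b in zip(reference, divided)
)
matches_when_multiplied = all(
    math.isclose(a, b) for a, b in zip(reference, multiplied)
)

print(matches_when_divided, matches_when_multiplied)  # True False
```

The learnable gain `g` is left out because it multiplies both variants identically and does not affect the comparison.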

Edit:

Also, I think the normalization should be over dim -1 (the feature dimension), not -2.

https://github.com/kyegomez/zeta/blob/7dbb6a62f83413977a922d5fc6dec1b11f734bc3/zeta/nn/modules/rms_norm.py#L34
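
The difference between the two axes can be sketched without any framework. This example assumes an input laid out as `(seq_len, dim)` with features on the last axis, which is an assumption about how the module is called; the function names are hypothetical. Normalizing over the last axis gives every token an RMS of 1, whereas normalizing over the second-to-last axis rescales each feature across tokens and leaves per-token RMS values uncontrolled.

```python
import math


def rms(values):
    """Root mean square of a flat sequence of floats."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def rms_norm_last_axis(rows):
    """Normalize each row (token) by its own RMS: the dim=-1 behavior."""
    return [[v / rms(row) for v in row] for row in rows]


def rms_norm_second_last_axis(rows):
    """Normalize each column (feature) by its RMS across rows: the dim=-2 behavior."""
    col_rms = [rms(col) for col in zip(*rows)]
    return [[v / r for v, r in zip(row, col_rms)] for row in rows]


# Two tokens with four features each; the second token is 10x the first.
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
]

per_token = rms_norm_last_axis(tokens)
per_feature = rms_norm_second_last_axis(tokens)

# With dim=-1 every token ends up with unit RMS, regardless of its original scale.
per_token_rms = [rms(row) for row in per_token]
# With dim=-2 the tokens keep their 10x scale difference.
per_feature_rms = [rms(row) for row in per_feature]

print(per_token_rms)
print(per_feature_rms)
```

With the last-axis version, the output for a token depends only on that token, which is the property RMSNorm is meant to have; the second-to-last-axis version couples tokens in the same sequence.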


github-actions[bot] commented 7 months ago

Hello there, thank you for opening an issue! 🙏🏻 The team has been notified and will get back to you as soon as possible.

github-actions[bot] commented 5 months ago

Stale issue message