cryscan / web-rwkv

Implementation of the RWKV language model in pure WebGPU/Rust.

LayerNorm improvements #14

Closed: FL33TW00D closed this issue 6 months ago

FL33TW00D commented 6 months ago

https://fleetwood.dev/posts/layernorm-as-fast-as-possible

Your LayerNorm is subject to floating-point precision errors! You're also missing eps in the denominator.

cryscan commented 6 months ago

Thanks! Earlier in development, one of my friends pointed out that I should use Welford's algorithm, but I was too busy then. Now that I have time, I will do a quick fix.
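
For reference, here is a minimal CPU-side sketch of what such a fix might look like: a LayerNorm that computes the mean and variance in a single pass with Welford's algorithm and adds eps to the variance before the square root. The function name `layer_norm_welford` and its parameters (`gamma`, `beta`, `eps`) are hypothetical and chosen for illustration. This is not web-rwkv's actual code, which may differ in structure and precision.

```rust
/// Layer normalization over one row of activations.
/// Hypothetical helper for illustration only; not taken from web-rwkv.
fn layer_norm_welford(x: &[f32], gamma: &[f32], beta: &[f32], eps: f32) -> Vec<f32> {
    assert_eq!(x.len(), gamma.len());
    assert_eq!(x.len(), beta.len());
    assert!(!x.is_empty());

    let mut count = 0.0f32;
    let mut mean = 0.0f32;
    let mut m2 = 0.0f32; // running sum of squared deviations from the mean

    // Welford's update: one pass, and no subtraction of two large,
    // nearly equal sums as in the E[x^2] - E[x]^2 formulation.
    for &v in x {
        count += 1.0;
        let delta = v - mean;
        mean += delta / count;
        m2 += delta * (v - mean);
    }

    // Population variance (divide by n), then add eps so that a
    // constant row does not produce a division by zero.
    let var = m2 / count;
    let inv_std = 1.0 / (var + eps).sqrt();

    x.iter()
        .zip(gamma.iter().zip(beta.iter()))
        .map(|(&v, (&g, &b))| (v - mean) * inv_std * g + b)
        .collect()
}
```

Calling `layer_norm_welford(&[1.0, 2.0, 3.0, 4.0], &[1.0; 4], &[0.0; 4], 1e-5)` normalizes the row to roughly zero mean and unit variance. With a constant row the variance is zero, and eps keeps the result finite.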