pulp-platform / snitch_cluster

An energy-efficient RISC-V floating-point compute cluster.
https://pulp-platform.github.io/snitch_cluster/
Apache License 2.0
51 stars 51 forks

Optimized LayerNorm kernel #103

Closed: viv-eth closed this 8 months ago

viv-eth commented 8 months ago

This PR adds an FP32 LayerNorm kernel that uses SSRs (stream semantic registers) and FREP (the floating-point repetition instruction) to improve performance.