kyegomez / FastFF

Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"
MIT License

Incorrect Implementation #14

Open ThomasPluck opened 4 months ago

ThomasPluck commented 4 months ago
  1. Weight matrices should be sized 2 ** depth - 1, not 2 * depth - 1
  2. The ChatGPT clamping fix in CMM is unnecessary after this correction
  3. Einsum usage is very inefficient in both time and space for FFF
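
A minimal pure-Python sketch of items 1 and 2, not the repository's code. It assumes the node weights are stored in heap order (children of node i at 2i + 1 and 2i + 2), which this thread does not confirm; the names `n_nodes` and `cmm_path` are hypothetical. The point is that a complete binary tree with `depth` levels has 2 ** depth - 1 nodes, so a root-to-leaf walk never indexes past the end of the weights and no clamp is needed.

```python
import random


def n_nodes(depth: int) -> int:
    """Number of nodes in a complete binary tree with `depth` levels."""
    return 2 ** depth - 1


def cmm_path(x, weights, depth):
    """Walk one input down the tree, visiting exactly one node per level.

    x:       list of floats (the input vector)
    weights: one weight vector per node, in heap order (assumed layout)
    Returns the list of visited node indices.
    """
    node = 0
    path = []
    for _ in range(depth):
        path.append(node)
        # The sign of the dot product decides which child to descend into.
        score = sum(w * xi for w, xi in zip(weights[node], x))
        node = 2 * node + 1 + (1 if score > 0 else 0)
    return path


depth, dim = 4, 8
rng = random.Random(0)
weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_nodes(depth))]
x = [rng.gauss(0, 1) for _ in range(dim)]

path = cmm_path(x, weights, depth)

# The deepest level holds indices 7..14 here, all below n_nodes(4) == 15.
# With only 2 * depth - 1 == 7 weight vectors, those indices would be out
# of range, which is the situation a clamp would have been papering over.
assert max(path) < n_nodes(depth)
```

Each input touches only `depth` weight vectors out of 2 ** depth - 1. A dense contraction over every node does exponentially more work than this walk, which may be what item 3 is pointing at, though the thread does not spell it out.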

github-actions[bot] commented 4 months ago

Hello there, thank you for opening an issue! 🙏🏻 The team has been notified and will get back to you ASAP.

github-actions[bot] commented 2 months ago

Stale issue message

kyegomez commented 2 months ago

@ThomasPluck can you submit a pr pls