ironjr / grokfast

Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
https://arxiv.org/abs/2405.20233
MIT License
476 stars, 39 forks

Was trying to stick this code into Trainer's inner training loop. #6

Closed by phalexo 2 months ago