ironjr / grokfast

Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
https://arxiv.org/abs/2405.20233
MIT License
476 stars 39 forks

Any experiments / gotchas to be aware of when using schedulefree optimizer? #4

Open dawood95 opened 2 months ago

dawood95 commented 2 months ago

How will this interact with the schedule-free optimizer (https://github.com/facebookresearch/schedule_free)? Are there any gotchas to think about?
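
For context, here is a minimal sketch of the kind of EMA gradient filter the question is about. It assumes the filter only rewrites gradients between the backward pass and the optimizer step. The name `ema_filter`, the plain-list gradients, and the default values are illustrative assumptions, not the repository's API. Under that assumption, whichever optimizer runs next (schedule-free or otherwise) would simply receive the filtered gradient. Whether the two combine well in practice is the open question here.

```python
def ema_filter(grad, ema=None, alpha=0.98, lamb=2.0):
    """One step of an EMA gradient filter (illustrative sketch, hypothetical name).

    grad  : list of floats, the raw gradient for this step
    ema   : running average from the previous step, or None on the first step
    alpha : EMA momentum (assumed default)
    lamb  : amplification factor for the slow component (assumed default)
    Returns (filtered_grad, new_ema).
    """
    if ema is None:
        # First step: start the running average at the current gradient.
        ema = list(grad)
    else:
        # Low-pass filter: keep the slowly varying part of the gradient.
        ema = [alpha * m + (1.0 - alpha) * g for m, g in zip(ema, grad)]
    # Add the amplified slow component back onto the raw gradient.
    filtered = [g + lamb * m for g, m in zip(grad, ema)]
    return filtered, ema


# Usage: filter the gradient, then hand it to any optimizer step.
state = None
for raw_grad in ([1.0, -2.0], [0.5, -1.5], [0.25, -1.0]):
    filtered, state = ema_filter(raw_grad, state)
    # optimizer.step() would consume `filtered` at this point.
```

One thing worth checking when combining the two: the schedule-free optimizers document explicit `optimizer.train()` and `optimizer.eval()` calls, so a filter like this would presumably run only while the optimizer is in train mode.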