ironjr / grokfast

Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
https://arxiv.org/abs/2405.20233
MIT License
476 stars, 39 forks

Help finding hyperparameters for Llama 2 #7

Open 50Bytes-dev opened 2 months ago

50Bytes-dev commented 2 months ago

Hey, everybody. Can you tell me which hyperparameters I should try first?