unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

feat: add option for using ADOPT optimizer based on Taniguchi, Shohei, et al. #1270

Open Selich opened 1 week ago

Selich commented 1 week ago

I am currently implementing ADOPT in my work codebase, and it would be nice if unsloth could support it as well. Based on the paper, it does outperform Adam (but this time for real). It is based on Taniguchi, Shohei, et al. "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate." arXiv, 2024, https://arxiv.org/abs/2411.02853

I can work on this in a few days once I get some free time.

Proposed Changes
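
To make the proposal concrete, here is a minimal sketch of the core ADOPT update for a single tensor, following Algorithm 2 of the paper. The function name `adopt_step` and its signature are illustrative, not existing unsloth or PyTorch API; the defaults (β1=0.9, β2=0.9999, ε=1e-6) follow the paper's recommendations. The two differences from Adam are that the gradient is normalized by the second-moment estimate from the *previous* step, and that this normalization happens *before* the momentum update rather than after.

```python
import torch

@torch.no_grad()
def adopt_step(param, grad, m, v, step,
               lr=1e-3, beta1=0.9, beta2=0.9999, eps=1e-6):
    """One ADOPT update for a single tensor (sketch of Algorithm 2)."""
    if step == 0:
        # t = 0: only initialize the second moment; no parameter update yet.
        v.copy_(grad * grad)
        return
    # Normalize the current gradient with *last* step's second moment.
    normed = grad / torch.clamp(v.sqrt(), min=eps)
    # Exponential moving average of the normalized gradient (m starts at zero).
    m.mul_(beta1).add_(normed, alpha=1 - beta1)
    # Parameter update uses only the momentum buffer.
    param.add_(m, alpha=-lr)
    # Second moment is updated *after* the step, from the raw gradient.
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
```

In practice this would be wrapped in a `torch.optim.Optimizer` subclass (keeping `m` and `v` in the optimizer state) so it can be plugged into the existing training loop like any other optimizer.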

danielhanchen commented 1 week ago

Oh this looks interesting! We always welcome new contributions! :)