facebookresearch / schedule_free

Schedule-Free Optimization in PyTorch
Apache License 2.0

[Feature Request] Add AdEMAMixScheduleFree #46

Open sdbds opened 2 months ago

sdbds commented 2 months ago


Code: https://github.com/nanowell/AdEMAMix-Optimizer-Pytorch

8-bit version from bitsandbytes (bnb): https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/bitsandbytes/optim/ademamix.py

Tests have shown that AdEMAMix outperforms AdamW with little to no increase in memory usage.
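For readers unfamiliar with the update, here is a rough single-parameter sketch of an AdEMAMix-style step, assuming the form given in the AdEMAMix paper: Adam's two buffers plus one extra slow EMA of the gradient, mixed in with a weight `alpha`. The names `ademamix_step` and `new_state` are hypothetical, the defaults are illustrative, and the warmup schedules for `alpha` and `beta3` are left out, so this is not the code from either repository linked above.

```python
import math


def new_state():
    # Per-parameter optimizer state: step count plus three buffers.
    return {"t": 0, "m1": 0.0, "m2": 0.0, "nu": 0.0}


def ademamix_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                  beta3=0.9999, alpha=5.0, eps=1e-8, weight_decay=0.0):
    """One AdEMAMix-style update for a single scalar parameter (sketch)."""
    state["t"] += 1
    t = state["t"]
    # Fast EMA of gradients (same role as Adam's first moment).
    state["m1"] = beta1 * state["m1"] + (1 - beta1) * grad
    # Slow EMA of gradients: the extra buffer compared with Adam/AdamW.
    state["m2"] = beta3 * state["m2"] + (1 - beta3) * grad
    # EMA of squared gradients (Adam's second moment).
    state["nu"] = beta2 * state["nu"] + (1 - beta2) * grad * grad
    # Bias correction is applied to the fast EMA and the second moment only.
    m1_hat = state["m1"] / (1 - beta1 ** t)
    nu_hat = state["nu"] / (1 - beta2 ** t)
    update = (m1_hat + alpha * state["m2"]) / (math.sqrt(nu_hat) + eps)
    # Decoupled weight decay, as in AdamW.
    return theta - lr * (update + weight_decay * theta)


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
x = 1.0
state = new_state()
for _ in range(10):
    x = ademamix_step(x, 2.0 * x, state)
```

In this sketch the state holds three buffers per parameter (`m1`, `m2`, `nu`) where Adam holds two, which is the part relevant to the memory comparison.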

adefazio commented 2 months ago

Cool! I'll look into this.

araleza commented 2 months ago

It's great that you're looking into this, @adefazio. Schedule-free Adam was strong, and now AdEMAMix is giving me great results too. If it turns out it's possible to combine their advantages, that would be amazing.

And233 commented 1 week ago

> Cool! I'll look into this.

When will AdEMAMixScheduleFree be available? Can it be achieved by simply wrapping AdEMAMix with ScheduleFreeWrapper?