zyushun / Adam-mini

Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793

Is there any plan to implement a Quantized Adam-mini? #4

Open Kyeongpil opened 2 weeks ago

Kyeongpil commented 2 weeks ago

Hello, I greatly appreciate your excellent work.

As you discussed in the paper, I believe incorporating quantization methods like Adam8bit could further enhance the results.

Do you have any plans to implement a quantized version of Adam-mini in the near future?

Thank you for your consideration.
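For context, a minimal sketch of what such quantization typically looks like: blockwise absmax int8 quantization, as used by optimizers like Adam8bit. In Adam-mini the momentum `m` dominates the remaining optimizer memory (since `v` is already reduced to one scalar per block), so it would be the natural target. Everything below (`quantize_blockwise`, `BLOCK_SIZE`, the block size itself) is illustrative and not part of the Adam-mini codebase.

```python
import torch

BLOCK_SIZE = 2048  # illustrative block size, not from the paper

def quantize_blockwise(x: torch.Tensor, block_size: int = BLOCK_SIZE):
    """Quantize a tensor to int8 with one absmax scale per block."""
    flat = x.flatten()
    pad = (-flat.numel()) % block_size
    if pad:
        flat = torch.cat([flat, flat.new_zeros(pad)])
    blocks = flat.view(-1, block_size)
    # One fp32 scale per block; int8 codes cut state memory ~4x vs fp32.
    scales = blocks.abs().amax(dim=1, keepdim=True).clamp_min(1e-12)
    codes = torch.round(blocks / scales * 127).to(torch.int8)
    return codes, scales.squeeze(1), x.shape, pad

def dequantize_blockwise(codes, scales, shape, pad):
    """Recover an approximate fp32 tensor from int8 codes and scales."""
    blocks = codes.to(torch.float32) * (scales.unsqueeze(1) / 127)
    flat = blocks.flatten()
    if pad:
        flat = flat[:-pad]
    return flat.view(shape)

# Round-trip check on a fake momentum tensor.
m = torch.randn(4096, 1024)
state = quantize_blockwise(m)
m_hat = dequantize_blockwise(*state)
print((m - m_hat).abs().max())  # small per-block quantization error
```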

zyushun commented 2 weeks ago

Thanks for your kind comments! We have no immediate plans to implement a quantized version of Adam-mini, but it is definitely an important direction to pursue. :D

snapo commented 1 week ago

I just wanted to ask the same thing, for both Adam4bit and Adam8bit :-)

winglian commented 1 week ago

Perhaps it's worth exploring whether the bitsandbytes base optimizer class can be extended, so that we get 8-bit quantization as well as paged memory support: https://github.com/TimDettmers/bitsandbytes/blob/main/bitsandbytes/optim/optimizer.py

zyushun commented 1 week ago

> Perhaps it's worth exploring whether the bitsandbytes base optimizer class can be extended, so that we get 8-bit quantization as well as paged memory support: https://github.com/TimDettmers/bitsandbytes/blob/main/bitsandbytes/optim/optimizer.py

Thanks for the great suggestion! We will work on it!
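Until that lands, here is a minimal sketch of the pattern the suggestion implies, reusing the hypothetical `quantize_blockwise`/`dequantize_blockwise` helpers from the earlier comment: keep optimizer state as int8 between steps and work in fp32 inside `step()`. A real integration would instead subclass the bitsandbytes base optimizer linked above to also inherit its fused 8-bit kernels and paged-memory support; the toy momentum-SGD below is a stand-in, not that integration and not Adam-mini itself.

```python
import torch

class QuantizedStateSGD(torch.optim.Optimizer):
    """Toy momentum-SGD that stores its state in int8 between steps.

    Illustrates the dequantize -> update in fp32 -> re-quantize loop;
    hypothetical and not part of Adam-mini or bitsandbytes.
    """

    def __init__(self, params, lr=1e-2, momentum=0.9):
        super().__init__(params, dict(lr=lr, momentum=momentum))

    @torch.no_grad()
    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                if "m_q" not in state:
                    m = torch.zeros_like(p, dtype=torch.float32)
                else:
                    # Dequantize stored int8 state back to fp32.
                    m = dequantize_blockwise(*state["m_q"])
                # Standard momentum update, done in full precision.
                m.mul_(group["momentum"]).add_(p.grad)
                p.add_(m, alpha=-group["lr"])
                # Re-quantize before the state is stored again.
                state["m_q"] = quantize_blockwise(m)
        return loss
```

One design note: quantizing only between steps keeps the update math in fp32, which is the approach 8-bit optimizers generally take; bitsandbytes additionally offers paged variants that spill state to CPU memory via CUDA unified memory, which this sketch does not attempt.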