hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible
https://www.colossalai.org
Apache License 2.0

[FEATURE]: Integrate GaLore into ColossalAI Optimizer (Gemini/Hybrid) #5443

ericxsun closed 1 month ago

ericxsun commented 7 months ago

Describe the feature

A recent paper, "GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection" (https://arxiv.org/pdf/2403.03507.pdf), presents a memory-efficient approach to training large language models (LLMs).

Can we integrate this memory-efficient technique into the ColossalAI framework?
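
For readers unfamiliar with the technique, here is a rough sketch of the idea the paper's title describes: project each 2-D gradient onto a low-rank subspace, keep the Adam moments in that smaller space, then project the update back. This is a NumPy illustration only, not ColossalAI's or the paper authors' implementation. The class name `GaLoreAdamSketch` and the parameters `update_proj_gap` and `scale` are hypothetical names chosen for this example, and it assumes a weight matrix with no more rows than columns so that a left projection is used.

```python
import numpy as np

class GaLoreAdamSketch:
    """Illustrative low-rank-projected Adam for a single (m, n) weight, m <= n."""

    def __init__(self, rank, lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                 update_proj_gap=200, scale=0.25):
        self.rank, self.lr, self.betas, self.eps = rank, lr, betas, eps
        self.update_proj_gap = update_proj_gap  # steps between projector refreshes
        self.scale = scale                      # extra factor on the projected update
        self.P = None                           # (m, rank) projector
        self.m = None                           # first moment, shape (rank, n)
        self.v = None                           # second moment, shape (rank, n)
        self.t = 0                              # step counter

    def step(self, weight, grad):
        # Refresh the projector from the current gradient's top left singular vectors.
        if self.P is None or self.t % self.update_proj_gap == 0:
            U, _, _ = np.linalg.svd(grad, full_matrices=False)
            self.P = U[:, :self.rank]

        # Project the gradient into the low-rank space: (rank, n) instead of (m, n).
        R = self.P.T @ grad
        if self.m is None:
            self.m = np.zeros_like(R)
            self.v = np.zeros_like(R)

        # Standard Adam moment updates with bias correction, on the small matrix.
        self.t += 1
        b1, b2 = self.betas
        self.m = b1 * self.m + (1 - b1) * R
        self.v = b2 * self.v + (1 - b2) * R ** 2
        m_hat = self.m / (1 - b1 ** self.t)
        v_hat = self.v / (1 - b2 ** self.t)
        N = m_hat / (np.sqrt(v_hat) + self.eps)

        # Project the normalized update back to the full (m, n) shape and apply it.
        return weight - self.lr * self.scale * (self.P @ N)


rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
G = rng.standard_normal((8, 16))
opt = GaLoreAdamSketch(rank=2)
W_new = opt.step(W, G)
print(opt.m.shape)  # (2, 16)
```

The memory saving in this sketch comes from the optimizer state: the two Adam moments have shape `(rank, n)` rather than `(m, n)`, at the cost of storing the `(m, rank)` projector and recomputing it periodically.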

FYI

ericxsun commented 6 months ago

Could anyone from the ColossalAI team take a look?

ver217 commented 6 months ago

Thanks! We will take a look.

ericxsun commented 6 months ago

I see the MR. That's awesome! When can we use it?

Edenzzzz commented 6 months ago

I plan to release it next week.