VITA-Group / Q-GaLore

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
Apache License 2.0

HF Transformers #8

Open GeraudBourdin opened 2 months ago

Hello,

Thanks for this promising optimizer.

I've seen this PR, https://github.com/huggingface/transformers/pull/31936, but it has not been merged yet.

In the meantime, is there any way to use Q-GaLore with HF Transformers, as explained in this post: https://github.com/huggingface/transformers/issues/32225#issuecomment-2269932414 ?

It would be great to have some sample code for this.

Thanks!