Vahe1994 / AQLM

Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/pdf/2401.06118.pdf) and "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression" (https://arxiv.org/abs/2405.14852)
Apache License 2.0
1.18k stars, 178 forks

[Feature Request] Gemma2 support & models #111

Closed. Opened by kristaller486, closed 1 month ago.

kristaller486 commented 5 months ago

Gemma2 is an incredible model. With AQLM, it would fit into a 12GB GPU.

Vahe1994 commented 4 months ago

Hello! Sorry for the late answer. It should fit. I'm planning to get to this soon. I'll keep you updated in this thread, but it may take weeks.
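
As a rough back-of-the-envelope check of the 12GB claim, the weight footprint can be estimated as parameters times bits per weight. This is only a sketch: the parameter counts below are nominal round numbers assumed for illustration, and the 2-bit figure is an assumed target rate rather than a measured one. It ignores codebooks, any layers left unquantized (such as embeddings), the KV cache, and activations, so real usage would be higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: nominal parameter counts, not exact figures from the model cards.
ASSUMED_PARAMS = {"gemma2-9b": 9e9, "gemma2-27b": 27e9}

# Assumption: roughly 2 bits per weight after quantization.
ASSUMED_QUANT_BITS = 2

for name, n_params in ASSUMED_PARAMS.items():
    fp16_gb = weight_memory_gb(n_params, 16)
    quant_gb = weight_memory_gb(n_params, ASSUMED_QUANT_BITS)
    print(f"{name}: fp16 ~{fp16_gb:.1f} GB, {ASSUMED_QUANT_BITS}-bit ~{quant_gb:.2f} GB")
```

Under these assumptions, the quantized weights alone come in well under 12 GB for both sizes, which leaves the remaining memory for the overheads the estimate leaves out.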

kristaller486 commented 4 months ago

Thank you for the answer. I'll be waiting!

igvasilev commented 3 months ago

Will you add Gemma2 27B as well?

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 14 days since being marked as stale.