Aaronhuang-778 / BiLLM

(ICML 2024) BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
https://arxiv.org/abs/2402.04291
MIT License

Do you quantize the LM head, embedding, and layernorms or just the weights? #4

Open · tsengalb99 opened this issue 4 months ago
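For context on what the question is asking, here is a minimal sketch of the convention many post-training quantization pipelines (e.g. GPTQ-style methods) follow: only the `nn.Linear` weights inside the transformer blocks are quantized, while the token embedding, `lm_head`, and LayerNorms stay in full precision. This is an assumption about common practice, not a statement of what BiLLM actually does; the toy model and module names below are hypothetical and merely mirror Hugging Face-style naming.

```python
# Hypothetical illustration, NOT BiLLM's code: show which modules a
# "weights-only" PTQ pass would typically touch in a transformer LM.
import torch.nn as nn

class TinyCausalLM(nn.Module):
    """Toy stand-in for an LLM; names loosely mirror HF conventions."""
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab, dim)      # usually kept FP16
        self.layers = nn.ModuleList([
            nn.ModuleDict({
                "q_proj": nn.Linear(dim, dim),            # candidate for quantization
                "input_layernorm": nn.LayerNorm(dim),     # usually kept FP16
            })
            for _ in range(2)
        ])
        self.lm_head = nn.Linear(dim, vocab)              # often kept FP16

def quantizable_modules(model: nn.Module):
    """Select only Linear layers inside the transformer blocks, skipping lm_head."""
    return {
        name: m
        for name, m in model.named_modules()
        if isinstance(m, nn.Linear) and name.startswith("layers.")
    }

model = TinyCausalLM()
print(sorted(quantizable_modules(model)))
# ['layers.0.q_proj', 'layers.1.q_proj']
# embed_tokens, the LayerNorms, and lm_head are excluded under this convention.
```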