mobiusml / hqq

Official implementation of Half-Quadratic Quantization (HQQ)
https://mobiusml.github.io/hqq_blog/
Apache License 2.0

Hqq vs gguf #118

Closed blap closed 1 week ago

blap commented 1 month ago

Is there an easy way to convert gguf to hqq and vice-versa? Any comparisons? https://github.com/leafspark/AutoGGUF

mobicham commented 1 month ago

Hi! What kind of quantization is GGUF using? If it's asymmetric quantization (with both scales and zeros), it could be converted.
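To illustrate what "could be converted" might mean in the asymmetric case, here is a minimal sketch. It is not code from hqq or llama.cpp: the function names are hypothetical, and it assumes the source format stores a per-group scale `d` and minimum `m` (dequantizing as `d * q + m`), while the target uses a scale and zero-point (dequantizing as `(q - zero) * scale`). Under those assumptions the two parameterizations are related by simple arithmetic.

```python
# Hypothetical sketch: plain Python, not the hqq or llama.cpp API.

def dequant_min_based(q, d, m):
    """Assumed source form: w = d * q + m (per-group scale d and minimum m)."""
    return [d * qi + m for qi in q]


def dequant_zero_based(q, scale, zero):
    """Assumed target form: w = (q - zero) * scale (scale and zero-point)."""
    return [(qi - zero) * scale for qi in q]


def min_to_zero(d, m):
    """Map (d, m) to (scale, zero) so that both forms dequantize identically.

    d * q + m == (q - zero) * scale  holds when  scale = d  and  zero = -m / d.
    """
    if d == 0:
        raise ValueError("scale must be non-zero")
    return d, -m / d


# Example with 4-bit codes from a single group (values chosen for illustration).
q = [0, 3, 7, 15]
d, m = 0.5, -2.0
scale, zero = min_to_zero(d, m)
a = dequant_min_based(q, d, m)
b = dequant_zero_based(q, scale, zero)
```

This only covers the arithmetic for one group. Group sizes, bit packing, and any format that quantizes the scales themselves would all have to match as well, which this sketch does not address.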

blap commented 1 month ago

Sorry, I don't know the specs, but you can find details about the format, and how llama.cpp converts HF models to GGUF, here: https://github.com/ggerganov/llama.cpp/tree/master/gguf-py

mobicham commented 1 month ago

Thanks for sharing. It looks like the logic is quite different, so unfortunately I don't think the two quantized outputs are compatible.