Closed: Frozenmad closed this issue 1 year ago
Hello there, nice work!

If I understand correctly, when `groupsize` is set above 0, the quantizer in the `gptq` module used by `opt` is only responsible for one group at a time. `opt_pack3` relies on the `quantizer.pack` function, but the quantizer holds only the zeros and scales of the last group when `groupsize` is above 0. So can `--save` and `--groupsize` work together in `opt.py` right now?
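To illustrate what I mean, here is a minimal, hypothetical sketch of the overwrite pattern (a simplified stand-in, not the repo's exact code): each per-group call to `find_params` replaces the quantizer's `scale` and `zero`, so after the loop only the last group's parameters remain for packing.

```python
import torch

class Quantizer:
    """Simplified, hypothetical stand-in for the repo's quantizer."""
    def find_params(self, w):
        # Overwrites scale/zero in place; previous groups' values are lost.
        wmin = w.min(dim=1, keepdim=True).values
        wmax = w.max(dim=1, keepdim=True).values
        self.scale = (wmax - wmin) / 7  # 3-bit range: 2**3 - 1 levels
        self.zero = torch.round(-wmin / self.scale)

W = torch.randn(16, 128)
groupsize = 64
quantizer = Quantizer()
for i in range(0, W.shape[1], groupsize):
    quantizer.find_params(W[:, i:i + groupsize])

# quantizer.scale / quantizer.zero now describe only the last group,
# i.e. columns [64:128], which is all a later pack() call would see.
print(quantizer.scale.shape)  # torch.Size([16, 1])
```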
Hi, saving group-quantized models is currently not supported in our code. The code was designed primarily for research purposes (in particular, on the quantization process itself) and is meant more as a reference implementation / proof of concept. In the past few months, several follow-up projects on GitHub have implemented some of the missing features around model exporting and quantized execution, e.g. GPTQ-for-LLaMA.
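For reference, the kind of change those projects make is roughly to collect the quantization parameters per group instead of keeping a single pair. A hypothetical sketch (not the actual GPTQ-for-LLaMA code; `find_group_params` is an illustrative name):

```python
import torch

def find_group_params(W, groupsize, quantizer):
    """Collect per-group scales/zeros instead of overwriting them (hypothetical)."""
    scales, zeros = [], []
    for i in range(0, W.shape[1], groupsize):
        quantizer.find_params(W[:, i:i + groupsize])
        scales.append(quantizer.scale.clone())
        zeros.append(quantizer.zero.clone())
    # One (scale, zero) column per group: what a pack() step and --save
    # would need to reconstruct every group, not just the last one.
    return torch.cat(scales, dim=1), torch.cat(zeros, dim=1)
```

The packing and inference code then also has to select the right group's parameters for each weight column, which is the part the follow-up projects add on top of this repo.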