IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

Can --save work with --groupsize in opt.py? #22

Closed · Frozenmad closed this issue 1 year ago

Frozenmad commented 1 year ago

Hello there, nice work!

If I understand correctly, when --groupsize is set above 0, the quantizer in the gptq module used by opt.py only holds the parameters of the group it is currently processing. opt_pack3 relies on the quantizer's stored scale and zero when packing each layer, and when --groupsize is above 0 those attributes only contain the values of the last group.
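For illustration, here is a minimal, self-contained sketch of that mismatch. This is not the repo's actual code: `find_params` below is a simplified stand-in for the quantizer's parameter search, and the loop only mimics the per-group re-estimation.

```python
import torch

# Sketch (assumed, simplified): group-wise quantization re-estimates
# scale/zero per group, so a single quantizer object only retains the
# parameters of the last group it processed.

def find_params(w, bits=3):
    # Simplified asymmetric min/max parameter search for one group.
    maxq = 2 ** bits - 1
    wmin = w.min(dim=1, keepdim=True).values
    wmax = w.max(dim=1, keepdim=True).values
    scale = (wmax - wmin).clamp(min=1e-8) / maxq
    zero = torch.round(-wmin / scale)
    return scale, zero

W = torch.randn(8, 512)  # one layer's weight matrix
groupsize = 128

scale, zero = None, None        # what a single quantizer object holds
all_scales, all_zeros = [], []  # what a group-aware --save would need

for i in range(0, W.shape[1], groupsize):
    scale, zero = find_params(W[:, i:i + groupsize])  # overwrites previous group
    all_scales.append(scale)
    all_zeros.append(zero)

# Packing from the quantizer alone sees only the last group's parameters:
assert torch.equal(scale, all_scales[-1])
# Saving group-quantized weights would need the full per-group tensors:
scales = torch.cat(all_scales, dim=1)  # shape (rows, n_groups)
zeros = torch.cat(all_zeros, dim=1)
print(scales.shape, zeros.shape)       # torch.Size([8, 4]) torch.Size([8, 4])
```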

So can --save and --groupsize work together in opt.py right now?

efrantar commented 1 year ago

Hi, saving group-quantized models is currently not supported in our code, which was primarily designed for research purposes (in particular on the quantization process itself) and is more meant as a reference implementation / proof of concept. In the past few months, there have been several other follow-up projects on GitHub which implement some of the missing features when it comes to model exporting and quantized execution, e.g. GPTQ-for-LLaMA.