IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

Testing GPTQ on CNN models containing group conv #56

Open xd1073321804 opened 2 months ago

xd1073321804 commented 2 months ago

Hi, to support CNN models I modified the GPTQ code as follows: (1) added support for group conv; (2) used symmetric quantization without a zero-point parameter.
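
Roughly, the changes look like this (a simplified sketch of what I mean, not my exact diff; the helper names `unfold_input`, `groupwise_hessians`, and `quantize_sym` are just for illustration, assuming inputs are unfolded into columns the same way `gptq.py` already does for Conv2d):

```python
import torch
import torch.nn as nn


def unfold_input(layer: nn.Conv2d, inp: torch.Tensor) -> torch.Tensor:
    # Turn a (B, C_in, H, W) conv input into columns of shape
    # (B * output positions, C_in * kh * kw), like GPTQ's Conv2d handling.
    unfold = nn.Unfold(layer.kernel_size, dilation=layer.dilation,
                       padding=layer.padding, stride=layer.stride)
    cols = unfold(inp)                                   # (B, C_in * kh * kw, L)
    return cols.permute(0, 2, 1).reshape(-1, cols.shape[1])


def groupwise_hessians(layer: nn.Conv2d, inp: torch.Tensor):
    # (1) Group conv support: accumulate one Hessian per group, since each group
    #     of output channels only sees its own slice of the unfolded columns.
    cols = unfold_input(layer, inp)
    per_group = cols.shape[1] // layer.groups            # (C_in / groups) * kh * kw
    hessians = []
    for g in range(layer.groups):
        xg = cols[:, g * per_group:(g + 1) * per_group]
        hessians.append(2.0 * xg.t() @ xg)               # Hessian proxy H_g ∝ X_gᵀ X_g
    return hessians


def quantize_sym(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # (2) Symmetric quantization: per-output-channel scale only, no zero point.
    qmax = 2 ** (bits - 1) - 1                           # 7 for signed 4-bit
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
```

The per-layer quantization loop then runs separately for each group, using that group's rows of the weight matrix together with its own H_g, and the symmetric quantizer simply drops the zero-point term.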

But I found the performance is not good on mobilenetv2/mnasnet1_0 when quantizing to 4 bits. Here are my results (accuracy, with the ratio to FP32 in parentheses):

| model | FP32 | GPTQ_W4 sym |
| --- | --- | --- |
| mbv2 | 71.88 | 60.84 (84.64%) |
| mnasnet1_0 | 73.47 | 64.71 (88.08%) |

I only saw resnet18/resnet50 quantization results in your paper; have you tested GPTQ on the mobilenetv2/mnasnet1_0 models?

Looking forward to your reply...