mikeseven closed 1 year ago
Hi @mikeseven Thank you for bringing this issue to our attention.
Can you please detail the accuracy results that you're getting when running resnet18, resnet50, and efficientnet_b0 with and without GPTQ? I want to make sure that we are getting similar results before I try to identify the source of the issue.
@mikeseven It is possible that you got insufficient results with GPTQ because the quick start arguments for GPTQ didn't have compatible default values. We fixed this (see #785), so you might want to try and run GPTQ again.
As a general note - the GPTQ algorithm performs differently for different networks, and its parameters might need different calibration. If the results are still insufficient, try to change the learning rate (--gptq_lr) or increase the number of GPTQ iterations (--gptq_num_calibration_iter). Note that the latter would increase the runtime of the algorithm.
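For anyone who wants to try several settings systematically, here is a minimal sketch that builds one command line per combination of the two flags mentioned above. Only `--gptq_lr` and `--gptq_num_calibration_iter` come from this thread. The script name `quick_start.py`, the `--model_name` flag, and the numeric values are assumptions for illustration, and any other arguments the quick start needs (for example, to enable GPTQ) are omitted.

```python
from itertools import product

# Hypothetical entry point: the thread does not name the quick-start script.
SCRIPT = "quick_start.py"

# Illustrative values only; these are not recommendations from the maintainers.
LEARNING_RATES = [1e-5, 1e-4, 1e-3]
NUM_ITERS = [5000, 20000]


def build_commands(model_name):
    """Return one argument list per (learning rate, iteration count) pair."""
    commands = []
    for lr, iters in product(LEARNING_RATES, NUM_ITERS):
        commands.append([
            "python", SCRIPT,
            "--model_name", model_name,  # hypothetical flag, assumed for this sketch
            "--gptq_lr", str(lr),
            "--gptq_num_calibration_iter", str(iters),
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_commands("efficientnet_b0"):
        print(" ".join(cmd))
```

Each printed line can be run by hand or passed to `subprocess.run`. Since more iterations mean a longer runtime, it may be worth starting with the smaller iteration counts.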
Dear @ofirgo,
Indeed, I found that your fix improved the previous results on efficientnet_b0 by about 1.1%. However, the main contributor is the batch size used in the evaluation of the Hessian. Using MSE for both activations and weights does provide a slight improvement, which suggests that the quantization configuration for PTQ and GPTQ may need to be different. I'll have to play with the parameters to figure out how to truly leverage it.
Thanks for your help.
@mikeseven Thank you for the informative feedback.
I'm closing the issue since it seems that you figured it out. Let us know if you need any more help.
Issue Type
Bug
Source
source
MCT Version
Main
OS Platform and Distribution
Ubuntu 20.04
Python version
3.11
Describe the issue
Expected behaviour
Code to reproduce the issue
Log output
No response