https://github.com/neuralmagic/compressed-tensors/pull/81 must be merged first
When specifying a scheme preset, the quantization modifier for GPTQ was not being properly initialized. In the example code below, despite specifying a W4A16 scheme, the quantization config was always empty:
```
Building quantization modifier with args: {'config_groups': {'config_group_0': QuantizationScheme(targets=['Linear'], weights=None, input_activations=None, output_activations=None)}}
```
The fix updates the GPTQ modifier initialization to correctly apply the preset scheme. I've also added unit tests to confirm that all variants of the GPTQ recipe function as intended.
Example Code
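A minimal sketch of a recipe that exercises the preset scheme. The W4A16 scheme and Linear targets come from this PR; the import paths, the oneshot call, the model name, dataset, and calibration settings are illustrative assumptions and may need to be adapted to your environment:

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

# GPTQ modifier built from the W4A16 preset scheme; before this fix the preset
# was dropped and the resulting quantization config had empty weights/activations.
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

# Placeholder one-shot run, shown only to illustrate how the recipe is applied.
oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    dataset="open_platypus",
    recipe=recipe,
    max_seq_length=512,
    num_calibration_samples=64,
)
```

With the fix, the logged quantization config should contain a fully populated W4A16 QuantizationScheme for the targeted Linear layers instead of None weights and activations.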