Closed: kylesayrs closed this pull request 2 months ago
@Satrat Can you specify what you're looking for in a skip test?
You could just initialize a model with some modules skipped (more than the lm_head) and others quantized, then search the logs for the debug string. Alternatively, testing your getattr_chain helper function directly on the model would be fine too.
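For illustration, a test along those lines might look something like the following. This is only a sketch: the import path, the helper's exact name and signature, and the stand-in quantization_scheme object are assumptions, not the PR's actual test code.

```python
import types

import pytest
import torch

from llmcompressor.utils import getattr_chain  # assumed import path


def test_getattr_chain_on_partially_quantized_model():
    model = torch.nn.ModuleDict(
        {
            "quantized": torch.nn.Linear(4, 4),
            "skipped": torch.nn.Linear(4, 4),  # e.g. an lm_head left unquantized
        }
    )
    # attach a stand-in quantization scheme only to the quantized module
    model["quantized"].quantization_scheme = types.SimpleNamespace(weights="args")

    # the chained lookup succeeds where the scheme exists...
    assert getattr_chain(model["quantized"], "quantization_scheme.weights", None) == "args"
    # ...and falls back to the default where it does not
    assert getattr_chain(model["skipped"], "quantization_scheme.weights", None) is None

    # without a default, a missing link in the chain raises
    with pytest.raises(AttributeError):
        getattr_chain(model["skipped"], "quantization_scheme.weights")
```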
Yeah, the failing base test is because of a bug from the previous release, which I fixed in the main branch. See: https://github.com/neuralmagic/compressed-tensors/blame/4b214e582c8434733efea79239cfadec9358b7fb/src/compressed_tensors/quantization/observers/base.py#L165-L167
Using my local machine and the main branch of compressed_tensors, I confirmed that the tests in tests/llmcompressor/modifiers/ and tests/llmcompressor/transformers/compression/ are passing.
Purpose
Changes
- Quantization is frozen when freeze_quantization is True (default), even if QuantizationModifier is wrapped by GPTQModifier
- Added get_attr_chain helper function to be used for getting chained attributes (see the sketch after this list)
- Used get_attr_chain to get weight quantization arguments and skip computation if the weight does not have valid args
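To make the chained-attribute idea concrete, a minimal sketch of such a helper is shown below. The name, signature, and default-handling behavior here are assumptions for illustration, not the PR's actual implementation.

```python
from typing import Any

_MISSING = object()  # sentinel distinguishing "no default given" from default=None


def get_attr_chain(obj: Any, chain_str: str, default: Any = _MISSING) -> Any:
    """Resolve a dotted attribute path such as "quantization_scheme.weights".

    Returns `default` if any attribute along the chain is missing;
    raises AttributeError if no default was provided.
    """
    res = obj
    for attr_name in chain_str.split("."):
        if not hasattr(res, attr_name):
            if default is not _MISSING:
                return default
            raise AttributeError(f"{res!r} has no attribute {attr_name!r}")
        res = getattr(res, attr_name)
    return res


# usage mirroring the skip logic described above: modules whose weights
# carry no valid quantization args are skipped rather than compressed
#
#   args = get_attr_chain(module, "quantization_scheme.weights", None)
#   if args is None:
#       continue  # skip computation for this module
```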
Testing

- Regression tested saving, loading, and vLLM inference with a group-quantized model
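For reference, the vLLM side of such a regression check can be as simple as loading the saved checkpoint and generating. This is a hedged sketch; the model path is a placeholder, not the checkpoint used in the PR.

```python
from vllm import LLM, SamplingParams

# placeholder path to the saved group-quantized checkpoint
llm = LLM(model="./model-w4a16-group-quantized")
outputs = llm.generate(
    ["The capital of France is"], SamplingParams(max_tokens=32)
)
print(outputs[0].outputs[0].text)
```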