Closed rahul-tuli closed 6 months ago
Do we still need the ignore list if we have a targets list - would be great if we didn't need architecture specific ignores like LlamaRMSNorm?
Side note: vLLMQuantizationModifier is a dangerous name to keep around, I would prefer if we didn't keep this as a modifier
Yeah, we can safely delete the ignore list. We only need to add a module to the ignore list if it would otherwise be covered by one of the config groups.
The vLLMQuantizationModifier vs. regular QuantizationModifier split is just to differentiate between the old and new quantization frameworks for now. We're going to get rid of the old framework soon, and at that point we can rename the modifier. But if the name itself is an immediate problem, sure, we can change it.
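To illustrate the point about the ignore list, here is a minimal sketch of how targets and ignore could interact. It is not the library's matching code: the helper names, the exact-match rule, and the sample module list (including lm_head) are assumptions made only for this example.

```python
from typing import Iterable, List, Tuple

# Hypothetical helper: a pattern matches if it equals the module's class name
# or its dotted module name. The real matching rules may differ.
def matches(name: str, type_name: str, patterns: Iterable[str]) -> bool:
    return any(p == type_name or p == name for p in patterns)

# Hypothetical helper: a module is quantized only if a target covers it
# and no ignore entry covers it.
def modules_to_quantize(
    modules: List[Tuple[str, str]],
    targets: Iterable[str],
    ignore: Iterable[str] = (),
) -> List[str]:
    targets, ignore = list(targets), list(ignore)
    return [
        name
        for name, type_name in modules
        if matches(name, type_name, targets)
        and not matches(name, type_name, ignore)
    ]

# Illustrative (module name, class name) pairs, not taken from a real model dump.
MODULES = [
    ("model.layers.0.self_attn.q_proj", "Linear"),
    ("model.layers.0.input_layernorm", "LlamaRMSNorm"),
    ("lm_head", "Linear"),
]

# LlamaRMSNorm is never matched by the "Linear" target, so ignoring it is redundant.
with_norm_ignored = modules_to_quantize(MODULES, ["Linear"], ["LlamaRMSNorm"])
without_ignore = modules_to_quantize(MODULES, ["Linear"])

# An ignore entry only matters for a module that a target would otherwise cover.
head_ignored = modules_to_quantize(MODULES, ["Linear"], ["lm_head"])
```

In this sketch the first two calls return the same list, which is why an architecture-specific entry such as LlamaRMSNorm adds nothing when the targets only name Linear layers.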
This PR enhances the user experience of the GPTQModifier by allowing it to directly accept quantization-related arguments, such as config_groups. This change simplifies the configuration process, enabling users to specify a single GPTQModifier instead of combining both a QuantizationModifier and a GPTQModifier into a recipe.
Key Changes
GPTQModifier now accepts quantization-related arguments directly, facilitating easier and more direct configuration.
Implementation Details
Under the hood, a vLLMQuantizationModifier is initialized with:
config_groups
ignore
num_calibration_samples
disable_observer_epoch
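The delegation described above can be sketched roughly as follows. This is not the PR's code: both classes are hypothetical stand-ins, the method name build_quantization_modifier is invented for illustration, and the contents of the sample config group are assumed. Only the four forwarded argument names come from the list above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class SketchQuantizationModifier:
    """Hypothetical stand-in for the inner vLLMQuantizationModifier."""
    config_groups: Dict[str, Any]
    ignore: List[str] = field(default_factory=list)
    num_calibration_samples: Optional[int] = None
    disable_observer_epoch: Optional[float] = None

@dataclass
class SketchGPTQModifier:
    """Hypothetical stand-in for a GPTQModifier that accepts quantization args."""
    config_groups: Optional[Dict[str, Any]] = None
    ignore: List[str] = field(default_factory=list)
    num_calibration_samples: Optional[int] = None
    disable_observer_epoch: Optional[float] = None

    def build_quantization_modifier(self) -> Optional[SketchQuantizationModifier]:
        # Without config_groups there is nothing to delegate; the user is
        # assumed to supply a separate quantization modifier in that case.
        if self.config_groups is None:
            return None
        # Forward exactly the four quantization-related arguments.
        return SketchQuantizationModifier(
            config_groups=self.config_groups,
            ignore=self.ignore,
            num_calibration_samples=self.num_calibration_samples,
            disable_observer_epoch=self.disable_observer_epoch,
        )

# Illustrative values only; the group layout is an assumption.
gptq = SketchGPTQModifier(
    config_groups={"group_0": {"targets": ["Linear"], "weights": {"num_bits": 4}}},
    ignore=["lm_head"],
    num_calibration_samples=512,
)
inner = gptq.build_quantization_modifier()
plain = SketchGPTQModifier().build_quantization_modifier()
```

Building the inner modifier from the outer one's arguments means the recipe author writes a single modifier entry while the quantization setup still runs through the same inner component.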
Example Configurations
Old Configuration:
New Simplified Configuration:
End-to-End Script Example
Recipe:
Output
Command
STDOUT