neuralmagic / sparseml

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Apache License 2.0

Preserve sparsity GPTQ #2281

Closed rahul-tuli closed 6 months ago

rahul-tuli commented 6 months ago

Recently a bug was revealed where, if the GPTQ modifier was applied consecutively after SparseGPT, the weight sparsity mask was not respected. This PR fixes that by preserving the mask; the mask is preserved automatically whenever the weight sparsity is greater than `SPARSITY_THRESHOLD`, which is currently set to 5%. A rough sketch of the idea is shown below.
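The snippet below is a minimal illustration of the mask-preservation idea described above, not the actual SparseML implementation; names like `should_preserve_mask`, `apply_gptq_with_mask`, and the standalone `SPARSITY_THRESHOLD` constant are assumptions for illustration only.

```python
# Illustrative sketch only: the helper names and the quantize_fn callback are
# hypothetical; the 5% threshold comes from the PR description.
import torch

SPARSITY_THRESHOLD = 0.05  # 5%, per the PR description


def should_preserve_mask(weight: torch.Tensor) -> bool:
    """Preserve the mask only when the layer is already meaningfully sparse."""
    sparsity = (weight == 0).float().mean().item()
    return sparsity > SPARSITY_THRESHOLD


def apply_gptq_with_mask(weight: torch.Tensor, quantize_fn) -> torch.Tensor:
    """Quantize the weight, then re-apply the original zero mask so the
    sparsity introduced by SparseGPT survives GPTQ."""
    mask = (weight != 0).to(weight.dtype)
    quantized = quantize_fn(weight)
    if should_preserve_mask(weight):
        quantized = quantized * mask  # zeroed weights stay zero after quantization
    return quantized
```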

Credits to @Satrat and @abhinavnmagic for proposing the fix

The unit test for consecutive application now passes without the increased relative tolerance that was added as part of #2272.
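A hedged sketch of the kind of check such a consecutive-application test implies (the actual test lives in the repo; `sparse_weight` and `gptq_weight` are stand-in tensors, not real fixture names):

```python
# Hypothetical assertion: after SparseGPT pruning followed by GPTQ, the zero
# pattern of the weights should be unchanged, so no loosened tolerance is needed.
import torch


def assert_sparsity_preserved(sparse_weight: torch.Tensor, gptq_weight: torch.Tensor) -> None:
    original_zero_mask = sparse_weight == 0
    assert torch.all(gptq_weight[original_zero_mask] == 0), (
        "GPTQ did not respect the SparseGPT sparsity mask"
    )
```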