neuralmagic / sparseml

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Apache License 2.0

Update Quantization Logging to New Framework #2313

Closed · Satrat closed 5 months ago

Satrat commented 5 months ago

The quantization percentage logging has been reporting 0.0% quantization since we switched over to the new framework. This PR updates the calculation to use the new framework. It also cleans up the string formatting and adds Embedding as a prunable/quantizable layer type so that the percentages are more accurate.
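
For context, the percentage calculation boils down to walking the model's prunable/quantizable modules (Linear, Conv, and now Embedding), summing their weight counts, and reporting what fraction is zeroed out or covered by a quantization config. Below is a minimal sketch of that idea, not the actual sparseml helpers; the `quantization_scheme` attribute check and the `COUNTED_TYPES` tuple are assumptions for illustration only.

```python
import torch

# Toy model; in practice this would be the model being finetuned.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 8),
    torch.nn.Embedding(10, 8),
)

# Layer types counted as prunable/quantizable (Embedding being the newly added one).
COUNTED_TYPES = (torch.nn.Linear, torch.nn.Conv2d, torch.nn.Embedding)


def param_stats(model: torch.nn.Module):
    """Return (total params, avg sparsity %, quantization %) over counted layers."""
    total = zero = quantized = 0
    for module in model.modules():
        if not isinstance(module, COUNTED_TYPES):
            continue
        weight = getattr(module, "weight", None)
        if weight is None:
            continue
        n = weight.numel()
        total += n
        zero += int((weight == 0).sum().item())
        # Assumption: quantized modules in the new framework carry a
        # "quantization_scheme" attribute; swap in the real check as needed.
        if getattr(module, "quantization_scheme", None) is not None:
            quantized += n
    sparsity_pct = 100.0 * zero / total if total else 0.0
    quant_pct = 100.0 * quantized / total if total else 0.0
    return total, sparsity_pct, quant_pct


total, sparsity_pct, quant_pct = param_stats(model)
print(f"There are {total} prunable params which have {sparsity_pct:.2f}% avg sparsity.")
print(f"There are {total} quantizable params, with a quantization percentage of {quant_pct:.2f}%.")
```

The `:.2f}%` formatting mirrors the cleaned-up log lines shown in the After output below.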

Before

2024-05-30 16:04:21 sparseml.transformers.finetune.session_mixin INFO     There are 1034420224 prunable params which have 1.2609391906088643 avg sparsity.
2024-05-30 16:04:21 sparseml.transformers.finetune.session_mixin INFO     There are 1034420224 quantizable params, with a quantization percentage of 0.0.

After

2024-05-30 15:57:12 sparseml.transformers.finetune.session_mixin INFO     There are 1099956224 prunable params which have 1.19% avg sparsity.
2024-05-30 15:57:12 sparseml.transformers.finetune.session_mixin INFO     There are 1099956224 quantizable params, with a quantization percentage of 88.08%.