neuralmagic / sparseml

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Apache License 2.0

Sparse Quantization Example Clarification #2334

Closed Satrat closed 5 months ago

Satrat commented 5 months ago

@dbarbuzzi pointed out during QA testing that the sparse quantized example does not make clear that its code snippets are meant to be run one after another in the same Python instance. This updates the README to make that explicit. It also updates the quantization config for w8a8 to use the "tensor" strategy, since that is the strategy we have been testing with for this example.
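For readers unfamiliar with the term, here is a minimal sketch of what a per-tensor ("tensor") strategy generally means for 8-bit symmetric quantization: a single scale is shared by every element of the tensor, rather than one scale per channel. This is plain Python for illustration only. It is not SparseML code or the README snippet, and the function names `quantize_per_tensor` and `dequantize` are hypothetical.

```python
# Illustrative sketch only: NOT the SparseML implementation.
# Per-tensor symmetric quantization uses ONE scale for the whole tensor.

def quantize_per_tensor(values, num_bits=8):
    """Quantize a 2D list of floats to signed integers with a single scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for row in values for v in row)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [
        [max(-qmax, min(qmax, round(v / scale))) for v in row]
        for row in values
    ]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [[q * scale for q in row] for row in quantized]


# Hypothetical weights; the largest magnitude (1.27) sets the shared scale.
weights = [[0.5, -1.27], [0.01, 0.64]]
q, scale = quantize_per_tensor(weights)
restored = dequantize(q, scale)
```

Because the scale is set by the largest magnitude in the whole tensor, small values such as `0.01` land on very few integer levels. That is the usual trade-off of a per-tensor strategy against finer-grained ones.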