neuralmagic / sparseml

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
Apache License 2.0

Llama2 7B Quantization Examples #2285

Closed: Satrat closed this pull request 6 months ago

Satrat commented 6 months ago

Creating a new examples folder with initial quantization examples for llama7b using ultrachat200k; a rough sketch of the one-shot flow is below.
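
For context, a minimal sketch of what a one-shot quantization example with SparseML typically looks like. This is not the exact code from this PR: the recipe path, output directory, registered dataset name, and calibration settings below are illustrative assumptions.

```python
# Illustrative sketch of one-shot quantization with SparseML (not the exact
# code in this PR). Recipe path, output dir, and sample counts are placeholders.
from sparseml.transformers import SparseAutoModelForCausalLM, oneshot

model = SparseAutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed base model for the 7B examples
    torch_dtype="auto",
    device_map="auto",
)

oneshot(
    model=model,
    dataset="ultrachat_200k",          # assumed registered name of the calibration dataset
    recipe="recipe.yaml",              # hypothetical recipe (e.g. a W4A16 quantization recipe)
    output_dir="./llama7b_quantized",  # hypothetical output location
    max_seq_length=512,                # illustrative calibration settings
    num_calibration_samples=512,
)
```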

Results

Models

Storing the model outputs on the network filesystem under /network/sadkins

Eval Results

```
sparseml.evaluate /network/sadkins/llama1.1b_W4A16_channel_compressed -d wikitext -i lm-evaluation-harness
```
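
Here `-d` selects the evaluation dataset (wikitext) and `-i` the evaluation integration (lm-evaluation-harness). As a quick sanity check before a full eval, the compressed checkpoint can also be reloaded directly; the sketch below assumes the checkpoint at this path loads via `SparseAutoModelForCausalLM`, and the prompt is only an illustration.

```python
# Hedged sanity-check sketch: reload the compressed checkpoint and generate a
# few tokens. The path mirrors the eval command above; the prompt is illustrative.
from transformers import AutoTokenizer
from sparseml.transformers import SparseAutoModelForCausalLM

path = "/network/sadkins/llama1.1b_W4A16_channel_compressed"
model = SparseAutoModelForCausalLM.from_pretrained(path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(path)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```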

Missing

review-notebook-app[bot] commented 6 months ago

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.
