tloen / alpaca-lora

Instruct-tune LLaMA on consumer hardware
Apache License 2.0

Sparsify (prune) models for performance #323

Open: claysauruswrecks opened this issue 1 year ago

claysauruswrecks commented 1 year ago

Neural Magic's open-division submissions to MLPerf Inference v3.0 report orders-of-magnitude performance increases from sparsified models: https://github.com/mlcommons/inference_results_v3.0/tree/main/open/NeuralMagic

claysauruswrecks commented 1 year ago

DeepSparse, Neural Magic's inference runtime for sparse models: https://github.com/neuralmagic/deepsparse