NolanoOrg / cformers

SoTA Transformers with C-backend for fast inference on your CPU.
MIT License
311 stars, 29 forks

Explore SparseGPT-style sparsification for models. #5

Open Ayushk4 opened 1 year ago

Ayushk4 commented 1 year ago

Refer to https://github.com/NolanoOrg/sparse_quant_llms and the SparseGPT paper (https://arxiv.org/abs/2301.00774).
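For anyone new to the topic, here is a minimal sketch of what "sparsifying" a weight matrix means. Note that this is plain per-row magnitude pruning, a simpler technique than SparseGPT, which additionally adjusts the remaining weights using approximate second-order information. The function name `magnitude_prune` and the toy matrix are hypothetical and are not taken from cformers or sparse_quant_llms.

```python
from typing import List


def magnitude_prune(weights: List[List[float]], sparsity: float) -> List[List[float]]:
    """Zero the smallest-magnitude fraction `sparsity` of entries in each row.

    Hypothetical helper for illustration only; this is NOT the SparseGPT algorithm.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    pruned = []
    for row in weights:
        n_prune = int(len(row) * sparsity)
        # Indices sorted by ascending |w|; the first n_prune are zeroed.
        order = sorted(range(len(row)), key=lambda i: abs(row[i]))
        drop = set(order[:n_prune])
        pruned.append([0.0 if i in drop else w for i, w in enumerate(row)])
    return pruned


# Toy 2x4 weight matrix (made-up values), pruned to 50% sparsity per row.
W = [[0.9, -0.1, 0.4, -0.05],
     [0.2, -0.7, 0.03, 0.6]]
W_sparse = magnitude_prune(W, 0.5)
print(W_sparse)
```

In each row the two smallest-magnitude weights are set to zero and the rest are left unchanged. A SparseGPT-style approach would also update the surviving weights to compensate for the removed ones.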