gottlike closed this 6 months ago
Now all we need is to combine this with 1-bit transformers and we can run everything on a cheap phone 😁: https://arxiv.org/abs/2310.11453
Another pruning method!
Thanks @gottlike and @SuperSecureHuman, we have Wanda implemented in SparseML: https://github.com/neuralmagic/sparseml/blob/3ddc2d458572003e3c8c4cba8e8fc332f13ff9d4/src/sparseml/modifiers/pruning/wanda/pytorch.py#L34 SparseGPT still seems to be the better option at higher levels of sparsity, though.
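For anyone skimming the thread, here is a rough sketch of the idea behind Wanda as I understand it from the paper: each weight is scored by its magnitude times the L2 norm of the matching input feature over some calibration data, and the lowest-scoring weights are dropped per output row. This is plain Python for illustration only, not the SparseML code linked above; the function name `wanda_prune` and its arguments are made up for this example.

```python
import math


def wanda_prune(weight, activations, sparsity):
    """Zero the lowest-scoring weights in each output row (illustrative only).

    weight:      list of rows, shape (out_features, in_features)
    activations: calibration inputs, shape (n_tokens, in_features)
    sparsity:    fraction of weights to remove per row, in [0, 1]
    """
    in_features = len(weight[0])
    # L2 norm of each input feature across all calibration tokens.
    feat_norm = [
        math.sqrt(sum(x[j] ** 2 for x in activations))
        for j in range(in_features)
    ]
    n_prune = int(in_features * sparsity)
    pruned = []
    for row in weight:
        # Score: |weight| times the norm of the input feature it multiplies.
        scores = [abs(w) * feat_norm[j] for j, w in enumerate(row)]
        # Indices of the n_prune smallest scores within this row.
        order = sorted(range(in_features), key=scores.__getitem__)
        drop = set(order[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# One output row, two identical calibration tokens (toy numbers).
W = [[0.1, -2.0, 0.5, 1.0]]
X = [[10.0, 0.1, 1.0, 1.0], [10.0, 0.1, 1.0, 1.0]]
print(wanda_prune(W, X, 0.5))  # [[0.1, 0.0, 0.0, 1.0]]
```

Note that the largest weight (-2.0) is removed because its input feature is almost always near zero, while the tiny 0.1 weight survives because it sits on a high-activation feature. Plain magnitude pruning would have made the opposite choice.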
Just an FYI about a new discovery that should be super relevant for you guys: https://twitter.com/aaquib_syed1/status/1714386165237776653