neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

I see that SparseGPT has been integrated into your project. Which files contain the specific modifications, and how do I use it? #1076

Closed 18140663659 closed 1 year ago

18140663659 commented 1 year ago

bfineran commented 1 year ago

Hi @18140663659, deepsparse is adding a text generation pipeline that supports running SparseGPT and other generative LLMs (#1064).

As for actually sparsifying models to replicate SparseGPT, we will soon release a major update to Sparsify that includes this. @jeanniefinks can provide more info on its release!

jeanniefinks commented 1 year ago

Hello @18140663659, as @bfineran mentioned, we are working on the next generation of Sparsify, which will let you apply optimizations like SparseGPT to your own models or to generic use cases through a web app and local one-command APIs.

This Sparsify Alpha is set to release next week. If you want to be notified when it goes live, fill out this form: https://neuralmagic.com/request-early-access-to-sparsify/

To use the SparseGPT algorithm on your own models specifically, check out the Sparsify Alpha's One-Shot Pathway once it's live. More to come!

jeanniefinks commented 1 year ago

Hi @18140663659, the Sparsify Alpha mentioned in the last comment is now live. Because it is an alpha, we are inviting a small subset of users like yourself to try it out and let us know what you think. Check it out at https://github.com/neuralmagic/sparsify. I will close this thread for now, but feel free to reopen it as needed!