neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Add Text Gen Alias #1487

Closed dsikka closed 8 months ago

dbogunowicz commented 8 months ago

Well... I know this is an attempt to bring back the same functionality as in v1, but let's be realistic about it. I'd definitely not put bloom in there, because we do not support it. I'd much rather put MPT and llama / llama2 in the alias, since this is what people are most likely to use.

dsikka commented 8 months ago

I can remove bloom but we still seem to support kv_cache injection for it?

dbogunowicz commented 8 months ago

@dsikka but it's not in sparsezoo, and we do not treat it as productionized in our LLM portfolio.