neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[Text Generation] Consolidate `TextGenerationPipeline` and `TextGenerationPipelineNoKVCache` #1510

Closed (dbogunowicz closed this 5 months ago)

dbogunowicz commented 5 months ago

@dsikka all unit and integration tests pass with #1523 included. I confirm that all the aforementioned properties are still present and functional in both "incarnations" of `TextGenerationPipeline`.