neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/
Other
2.97k stars, 171 forks

[Pipeline Refactor][Text Generation][Continuous Batching] Integration #1409

Closed: dsikka closed this 9 months ago

dsikka commented 10 months ago

Summary

Testing

dbogunowicz commented 9 months ago

Conditional approval: please run the relevant integration and unit tests before landing.