neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/
Other
2.94k stars, 169 forks

[TextGeneration] Fix internal kv_cache update for batch_size > 1 #1514

Closed by dsikka 6 months ago

dsikka commented 6 months ago

Summary