neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Add LLMPerf example for DeepSparse LLM Server #1502

Closed: mgoin closed this 7 months ago

mgoin commented 8 months ago

A simple example that walks through running LLMPerf benchmarks against the DeepSparse LLM server and analyzing the output statistics.

[Figure: inter_token_vs_throughput]
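
As a rough illustration of the kind of analysis this example covers, the sketch below relates mean inter-token latency to aggregate throughput at each concurrency level. It is not taken from the PR: the record fields (`concurrency`, `output_tokens`, `first_token_s`, `total_s`) and the sample values are hypothetical and are not LLMPerf's actual output schema. Throughput is approximated by assuming all requests at a given level start at the same time.

```python
from statistics import mean

# Hypothetical per-request records. Field names and values are illustrative
# only and do not reflect LLMPerf's real output format.
records = [
    {"concurrency": 1, "output_tokens": 101, "first_token_s": 0.2, "total_s": 2.2},
    {"concurrency": 2, "output_tokens": 101, "first_token_s": 0.3, "total_s": 4.3},
    {"concurrency": 2, "output_tokens": 101, "first_token_s": 0.3, "total_s": 4.3},
]


def inter_token_latency(rec):
    """Mean gap in seconds between consecutive output tokens after the first."""
    return (rec["total_s"] - rec["first_token_s"]) / (rec["output_tokens"] - 1)


def summarize(recs):
    """Group records by concurrency and compute latency and throughput per level."""
    by_level = {}
    for rec in recs:
        by_level.setdefault(rec["concurrency"], []).append(rec)

    summary = {}
    for level, group in sorted(by_level.items()):
        # Assumption: requests in a level start together, so the slowest
        # request approximates the wall-clock time for the whole level.
        wall_s = max(r["total_s"] for r in group)
        total_tokens = sum(r["output_tokens"] for r in group)
        summary[level] = {
            "mean_inter_token_s": mean(inter_token_latency(r) for r in group),
            "throughput_tok_per_s": total_tokens / wall_s,
        }
    return summary


summary = summarize(records)
for level, stats in summary.items():
    print(level, stats)
```

Plotting `mean_inter_token_s` against `throughput_tok_per_s` for each level would give a latency-versus-throughput curve of the sort the figure name suggests.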