ninehills / llm-inference-benchmark

LLM Inference benchmark
MIT License

Why is the inference FTL@1 longer after quantization with the vLLM framework? #1

Open luhairong11 opened 1 month ago

luhairong11 commented 1 month ago

[two screenshots of benchmark results attached]

ninehills commented 1 month ago

vLLM has already fixed this issue.

I will retest soon.
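
For context, FTL@1 here refers to first-token latency at concurrency 1. Below is a minimal, hypothetical sketch of how such a number can be measured against a running vLLM OpenAI-compatible server by timing the first streamed chunk; the server URL and model name are placeholders, not values from this issue.

```python
# Hypothetical sketch: measure first-token latency (FTL) against a vLLM
# OpenAI-compatible server. The URL and model name below are placeholders.
import time

import requests


def measure_ftl(prompt: str,
                url: str = "http://localhost:8000/v1/completions",
                model: str = "my-model") -> float:
    """Return seconds from sending the request until the first token arrives."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": 64,
        "stream": True,  # stream so the first SSE chunk marks the first token
    }
    start = time.monotonic()
    with requests.post(url, json=payload, stream=True, timeout=120) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line:
                continue
            data = line.decode("utf-8")
            if data.startswith("data: ") and data != "data: [DONE]":
                # The first data chunk carries the first generated token.
                return time.monotonic() - start
    raise RuntimeError("no tokens received")


if __name__ == "__main__":
    print(f"FTL@1: {measure_ftl('Hello, world') * 1000:.1f} ms")
```

A single sequential request like this approximates FTL@1; averaging over several warm requests gives a more stable figure.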