neuralmagic/nm-vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
https://nm-vllm.readthedocs.io
Other
251 stars
10 forks
[ BugFix ] Prompt Logprobs Detokenization (#6223)
#378
Closed
robertgshaw2-neuralmagic closed 3 months ago
robertgshaw2-neuralmagic commented 4 months ago
SUMMARY:
Cherry-pick the prompt logprobs detokenization hotfix (#6223).