vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Support JSON mode. #2483

Open MiyazonoKaori opened 7 months ago

MiyazonoKaori commented 7 months ago

Any plans to integrate modules such as lm-format-enforcer to support JSON mode?

simon-mo commented 7 months ago

Yes! This is one of our top priorities.

hadsed commented 7 months ago

Is there currently any work happening here? I could look into it if not.

fullstackwebdev commented 7 months ago

context-free grammar: [image attachment]

It would be cool if the llama.cpp GBNF grammars could be standardized across different LLM frameworks, so we could do things like the image above everywhere.
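
As background, GBNF is llama.cpp's plain-text grammar format. Below is a minimal sketch of what such a grammar might look like for a flat JSON object, embedded in a Python string only so it can be passed around between tools; the production rules are illustrative assumptions, not a grammar shipped by llama.cpp.

```python
# Illustrative llama.cpp-style GBNF grammar (a sketch, not an official
# grammar) that constrains generation to a flat JSON object.
JSON_OBJECT_GBNF = r"""
root   ::= object
object ::= "{" ws (pair ("," ws pair)*)? "}" ws
pair   ::= string ":" ws value
value  ::= string | number | "true" | "false" | "null"
string ::= "\"" [^"]* "\""
number ::= "-"? [0-9]+ ("." [0-9]+)?
ws     ::= [ \t\n]*
"""
```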

simon-mo commented 7 months ago

Per our roadmap, anything https://github.com/outlines-dev/outlines supports can be part of vLLM, as can any other framework that supports the LogitsProcessors API.
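
For context, a hedged sketch of the hook simon-mo mentions: vLLM's SamplingParams accepts a list of logits_processors, callables that receive the token ids generated so far plus the next-token logits and return (possibly modified) logits. Structured-output libraries implement JSON mode by masking every token that would violate the target schema; the toy processor below just bans a single token id to show the shape of the interface, and the model name in the commented usage is a placeholder.

```python
from typing import List

import torch

def make_ban_processor(banned_token_id: int):
    """Toy LogitsProcessor: forbid one token id. Real JSON-mode processors
    mask every token id that is invalid in the current parser state."""
    def processor(token_ids: List[int], logits: torch.Tensor) -> torch.Tensor:
        logits[banned_token_id] = float("-inf")  # make the token unsampleable
        return logits
    return processor

# Hypothetical offline usage (model name is a placeholder):
# from vllm import LLM, SamplingParams
# llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
# params = SamplingParams(max_tokens=64,
#                         logits_processors=[make_ban_processor(0)])
# outputs = llm.generate(["Emit some JSON:"], params)
```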

Kaotic3 commented 7 months ago

This looks good. I just read through outlines and it seems super useful.

When do we get it :D

findalexli commented 6 months ago

Has anyone benchmarked the latency introduced by outlines?
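
(No numbers were posted in this thread. A rough way to measure it yourself is to time the same request with and without guided decoding against a running vLLM OpenAI-compatible server; the endpoint, model name, and guided_json extra parameter below are assumptions to adapt to your own deployment.)

```python
import time

import requests

URL = "http://localhost:8000/v1/completions"
BASE = {
    "model": "mistralai/Mistral-7B-Instruct-v0.2",  # placeholder model
    "prompt": "Describe a user profile as JSON:",
    "max_tokens": 128,
    "temperature": 0,
}
# Tiny JSON schema for outlines-backed guided decoding (assumed setup).
SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
}

def timed(payload: dict) -> float:
    """Return wall-clock seconds for one completion request."""
    start = time.perf_counter()
    requests.post(URL, json=payload, timeout=120).raise_for_status()
    return time.perf_counter() - start

plain = timed(BASE)
guided = timed({**BASE, "guided_json": SCHEMA})
print(f"plain: {plain:.2f}s  guided: {guided:.2f}s  "
      f"overhead: {guided - plain:+.2f}s")
```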

solesensei commented 1 month ago

Is this already supported as described here https://github.com/vllm-project/vllm/issues/1191?

ChuckHend commented 1 month ago

I would also love to see this feature make it into vllm!

wxgeorge commented 3 weeks ago

> Is this already supported as described here #1191?

I suspect the same.

Specifically, https://github.com/vllm-project/vllm/pull/3211 implements handling for "response_format": { "type": "json_object" } in the completion request body, which is what I'm familiar with when people say "JSON mode".

ChuckHend commented 3 weeks ago

"request_format": { "type": "json_object" } seems to work for me. Also requires a prompt specifically asking for json response.