vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[RFC]: Deprecate stop_reason in OpenAI Entrypoint in favor of finish_reason; fix implementation of finish_reason #6202

Open K-Mistele opened 2 months ago

K-Mistele commented 2 months ago

Motivation.

This RFC is motivated by a desire to ensure full OpenAI compatibility in the completions & chat completions endpoints: deprecating stop_reason, which is not part of the OpenAI specification, and fixing the finish_reason field's implementation to be OpenAI-compatible, even though doing so may (or may not) be a breaking change for a small number of existing users.

Proposed Change.

Suggestion: recommend deprecating stop_reason

The Chat Completion Object in OpenAI's documentation specifies a finish_reason field, but not a stop_reason. While stop_reason doesn't break OpenAI compatibility, since it's just an extra property, it is a source of confusion for contributors such as myself, as well as for implementers & users of vLLM.

It is unclear when or why stop_reason was introduced to the project, although I am sure this could be determined by going back through old issues. Perhaps it is an artifact of an older version of the OpenAI API, or merely a typo?

Currently, stop_reason is almost always null/None, so it does not seem to add any value to the project.

Please share your feedback on deprecating and removing the stop_reason field from completion & chat completion responses in the OpenAI-compatible entrypoint :)

Suggestion: change finish_reason behavior for OpenAI compatibility

Currently, the finish_reason field in the completion & chat completion responses of the OpenAI-compatible entrypoint is almost always null/None (RESTful API vs. Python API). Per the OpenAI specification for the chat completion object, finish_reason should be one of the following:

- stop: the model hit a natural stop point or a provided stop sequence
- length: the maximum number of tokens specified in the request was reached
- content_filter: content was omitted due to a flag from content filters
- tool_calls: the model called a tool
- function_call (deprecated): the model called a function

Per the OpenAI specification, finish_reason should never be null or None.

My recommendation is to fix the implementation of finish_reason in these endpoints to ensure maximum OpenAI compatibility. PLEASE NOTE that this change may be breaking for vLLM users who use the OpenAI-compatible endpoint with non-OpenAI tooling and rely on a None/null value in the finish_reason field, but it is required for full OpenAI compatibility.
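To make the proposed behavior concrete, here is a minimal sketch of how a generation outcome could be mapped to a spec-compliant finish_reason. This is a hypothetical helper, not vLLM's actual implementation; the function name and parameters are made up for illustration.

```python
# Allowed finish_reason values per the OpenAI chat completion spec
# (function_call is deprecated but still in the spec).
OPENAI_FINISH_REASONS = {"stop", "length", "tool_calls", "content_filter", "function_call"}


def map_finish_reason(hit_length_limit: bool, emitted_tool_calls: bool) -> str:
    """Hypothetical mapping from a generation outcome to an OpenAI finish_reason.

    The key property proposed in this RFC: the result is always a concrete
    string, never None.
    """
    if emitted_tool_calls:
        return "tool_calls"
    if hit_length_limit:
        return "length"
    # A natural EOS token or a user-supplied stop string both map to "stop".
    return "stop"


assert map_finish_reason(False, False) in OPENAI_FINISH_REASONS
```

The point of the sketch is the fallthrough: every request terminates with one of the spec's enumerated values, so clients never have to handle a null finish_reason.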

Feedback Period.

I am attempting to finish implementing OpenAI-style tool calling in #5649 and would like to implement this fix in that PR, particularly by implementing the tool_calls value in the finish_reason field where appropriate. I anticipate that PR being ready for review in 1-2 weeks.

I would ideally like to have community and maintainer feedback by then, so that I can implement this fix/change and so that it doesn't become a blocker on that PR.

CC List.

@mgoin @WoosukKwon @simon-mo

Any Other Things.

This RFC is pertinent to PR #5649, since the tool_calls value for finish_reason needs to be implemented there, so I would like to properly implement the stop value as well.
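For context, a tool-calling response from an OpenAI-compatible endpoint would be expected to look roughly like the following (all identifiers and field values below are illustrative, not real output):

```python
# Illustrative shape of an OpenAI-compatible chat completion response
# when the model emits a tool call. Per the OpenAI spec, content is None
# and finish_reason is "tool_calls" -- never null/None.
response = {
    "id": "chatcmpl-example",          # made-up id
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_example",  # made-up id
                        "type": "function",
                        "function": {
                            "name": "get_weather",          # hypothetical tool
                            "arguments": '{"city": "Paris"}',
                        },
                    }
                ],
            },
            "finish_reason": "tool_calls",  # concrete value, not None
        }
    ],
}

assert response["choices"][0]["finish_reason"] == "tool_calls"
```

Note there is no stop_reason key at all in the OpenAI-specified shape, which is the compatibility gap this RFC addresses.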

ywang96 commented 2 months ago

Thank you for this RFC - IMO we should support OpenAI compatibility on a best-effort basis, so working on finish_reason aligns with our goal.

I do think there's value in keeping stop_reason, since finish_reason can't really be a direct drop-in replacement, especially if some users already rely on it for their downstream tasks.

simon-mo commented 2 months ago

Adding @njhill who initially added this.