K-Mistele opened 2 months ago
Thank you for this RFC - imo we should support OpenAI compatibility on a best-effort basis, so working on `finish_reason` aligns with our goal. I do think there is value in keeping `stop_reason`, since `finish_reason` can't really be a direct drop-in replacement, especially if some users already rely on it for their downstream tasks.
Adding @njhill who initially added this.
Motivation.
This RFC is motivated by a desire to ensure full OpenAI compatibility in the completions & chat completions endpoints by deprecating `stop_reason`, which is not part of the OpenAI specification, and by fixing the `finish_reason` field's implementation to be OpenAI-compatible, even though doing so may (or may not) be a breaking change for a small number of existing users.

Proposed Change.
Suggestion: deprecate `stop_reason`

The Chat Completion Object in OpenAI's documentation specifies a `finish_reason` field, but not a `stop_reason`. While `stop_reason` doesn't break OpenAI compatibility, since it is just an extra property, it is a source of confusion for contributors such as myself, as well as for implementers & users of vLLM.

It is unclear when or why `stop_reason` was introduced to the project, although I am sure this could be determined by going back through old issues. Perhaps it is an artifact of an older version of the OpenAI API, or it was merely a typo? Currently, `stop_reason` is almost always `null`/`None` - it does not seem to add any value to the project.

Please share your feedback on deprecating and removing the `stop_reason` field from completion & chat completion responses in the OpenAI-compatible entrypoint :)

Suggestion: change `finish_reason` behavior for OpenAI compatibility

Currently, the `finish_reason` field in the completion & chat completion responses of the OpenAI-compatible entrypoint is almost always `null` (RESTful API) / `None` (Python API). Per the OpenAI specification for the chat completion object, `finish_reason` should be one of the following:

- `length` if the response is terminated early due to a requested maximum length
- `stop` if the model stops naturally or a stop sequence is reached
- `tool_calls` if the model generates tool calls
- `content_filter` if a content filter terminates the model's response (not applicable for us)
- `function_call`, deprecated in favor of `tool_calls` and therefore not applicable to us

Per the OpenAI specification, `finish_reason` should never be `null`/`None`.
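As a rough sketch of the proposed behavior (the function and type names here are illustrative, not vLLM's actual protocol code), the fix amounts to constraining the field to the OpenAI enum and always populating it:

```python
from typing import Literal

# Illustrative sketch only: these names do not match vLLM's real internals.
FinishReason = Literal["stop", "length", "tool_calls", "content_filter"]

def resolve_finish_reason(hit_max_tokens: bool, generated_tool_calls: bool) -> FinishReason:
    """Map generation outcomes onto the OpenAI finish_reason enum.

    Per the OpenAI spec, every finished choice reports one of the enum
    values; the field is never null/None.
    """
    if generated_tool_calls:
        return "tool_calls"
    if hit_max_tokens:
        return "length"
    # Natural stop (EOS) or a user-supplied stop sequence was hit.
    return "stop"

print(resolve_finish_reason(hit_max_tokens=True, generated_tool_calls=False))  # length
```

`content_filter` is included in the type for spec completeness even though, as noted above, it is not applicable to vLLM.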
My recommendation is to fix the implementation of `finish_reason` in these endpoints to ensure maximum OpenAI compatibility. PLEASE NOTE that this change may be breaking for vLLM users who are using the OpenAI-compatible endpoint with non-OpenAI tooling and who rely on a `None`/`null` value in the `finish_reason` field, but it is required for full OpenAI compatibility.

Feedback Period.
I am attempting to finish implementing OpenAI-style tool calling in #5649 and would like to implement this fix in that PR, in particular by returning the `tool_calls` value in the `finish_reason` field where appropriate. I anticipate that PR being ready for review in 1-2 weeks. I would ideally like to have community and maintainer feedback by then, so that I can implement this fix/change and it doesn't become a blocker on that PR.
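For illustration only (the tool-call ID, function name, and arguments below are made up, not taken from vLLM or #5649), this is the shape of choice object the fix targets: `finish_reason` is always populated, and `stop_reason` is absent:

```python
import json

# Hypothetical OpenAI-compatible chat completion choice after the proposed fix.
choice = {
    "index": 0,
    "message": {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",  # made-up ID for illustration
                "type": "function",
                "function": {"name": "get_weather", "arguments": "{\"city\": \"Berlin\"}"},
            }
        ],
    },
    "finish_reason": "tool_calls",  # never null under the proposed change
}
print(json.dumps(choice, indent=2))
```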
CC List.
@mgoin @WoosukKwon @simon-mo
Any Other Things.
This RFC is pertinent to PR #5649, as the `tool_calls` value for `finish_reason` needs to be implemented there, so I would like to properly implement the `stop` value as well.