Background

vLLM currently supports various model features through configuration parameters, but go-openai lacks support for passing additional model-specific parameters through extra_body, which is particularly important for features like structured output.

https://github.com/vllm-project/vllm/blob/v0.6.0/vllm/engine/arg_utils.py#L276

Current OpenAI implementation

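from openai import OpenAI

client = OpenAI()
# Test is assumed to be a pydantic model defined elsewhere.
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Generate a user profile"}],
    # extra_body entries are added to the JSON request body as-is.
    extra_body={
        "guided_json": Test.schema_json(),  # call the method to get the schema
        "guided_decoding_backend": "lm-format-enforcer",
    },
)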

Proposed implementation

resp, err := integrations.LLMClient.Client.CreateChatCompletion(
    ctx,
    openai.ChatCompletionRequest{
        Model: "...",
        Messages: []openai.ChatCompletionMessage{
            ...
        },
        // Proposed field: extra key/value pairs to send in the request body.
        ExtraBody: map[string]any{
            ...
        },
    },
)
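
For illustration, here is a rough sketch of how an ExtraBody field could be flattened into the request JSON, which is what extra_body does in the Python client. Everything in it is an assumption made for the example: chatRequest is a trimmed-down stand-in for openai.ChatCompletionRequest, marshalWithExtraBody is a hypothetical helper, and neither reflects how go-openai is actually implemented or how maintainers would choose to do it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatRequest is a hypothetical, trimmed-down stand-in for
// openai.ChatCompletionRequest; it is not the library's real type.
type chatRequest struct {
	Model string `json:"model"`
	// ExtraBody is the proposed field. It is excluded from normal
	// marshaling and flattened into the top-level object instead.
	ExtraBody map[string]any `json:"-"`
}

// marshalWithExtraBody encodes req and then copies each ExtraBody entry
// into the top-level JSON object. Typed fields win on key collisions.
func marshalWithExtraBody(req chatRequest) ([]byte, error) {
	base, err := json.Marshal(req)
	if err != nil {
		return nil, err
	}
	if len(req.ExtraBody) == 0 {
		return base, nil
	}
	merged := map[string]any{}
	if err := json.Unmarshal(base, &merged); err != nil {
		return nil, err
	}
	for k, v := range req.ExtraBody {
		if _, exists := merged[k]; !exists {
			merged[k] = v
		}
	}
	return json.Marshal(merged)
}

func main() {
	body, err := marshalWithExtraBody(chatRequest{
		Model: "gpt-3.5-turbo",
		ExtraBody: map[string]any{
			"guided_decoding_backend": "lm-format-enforcer",
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Letting the typed fields win on collisions is a design choice made for this sketch; an implementation could just as well let ExtraBody override them. The round trip through a map also decodes numbers as float64, so a real implementation may prefer a different merge strategy.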

sashabaranov / go-openai

Feature Request: Support extra_body Parameter for Advanced Model Features #898
