openai / openai-python

The official Python library for the OpenAI API
https://pypi.org/project/openai/
Apache License 2.0

Structured outputs `response_format` requires `strict` function calling JSON Schema? #1733

Open moonbox3 opened 1 month ago

moonbox3 commented 1 month ago

Confirm this is an issue with the Python library and not an underlying OpenAI API

Describe the bug

I am using the OpenAI Python library 1.47.0 and the model gpt-4o-2024-08-06. I have the `json_schema` response format working with both Pydantic and non-Pydantic models (non-Pydantic meaning I manually create the proper response-format JSON schema), as long as no tool calling is involved. However, when I attempt to send tools with the payload to the method:

client.beta.chat.completions.parse(...)

I am getting an error because the tool's JSON schema does not include `strict`/`additionalProperties`.

The error shows as:

ValueError('`weather-get_weather_for_city` is not strict. Only `strict` function tools can be auto-parsed')

When I add `"strict": true` and `"additionalProperties": false`, I get a 200:

{
    "type": "function",
    "function": {
        "name": "weather-get_weather_for_city",
        "description": "Get the weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The input city"
                }
            },
            "required": ["city"],
            "additionalProperties": false
        },
        "strict": true
    }
}

In your docs, I don't see this coupling between function calling schema and json_schema response format called out (if it is there, I am obviously missing it).

The docs say:

Structured Outputs is available in two forms in the OpenAI API:

- When using [function calling](https://platform.openai.com/docs/guides/function-calling)
- When using a json_schema response format

This makes it seem like they're able to be used independently.

As an additional note: in .NET, I can use the OpenAI library to call the normal chat completions endpoint, configure the proper strict JSON schema for the `json_schema` response format, and not need to modify the function-calling JSON schema to include `strict` or `additionalProperties`. The calls work fine; no 400s encountered. Something like this:

chatCompletion = (await RunRequestAsync(() => this.Client!.GetChatClient(targetModel).CompleteChatAsync(chatForRequest, chatOptions, cancellationToken)).ConfigureAwait(false)).Value;

To Reproduce

  1. Use the latest OpenAI package
  2. Configure a Pydantic model as the response_format
  3. Include a tool (with non-strict JSON Schema) with the payload
  4. Make a call to client.beta.chat.completions.parse(...)
  5. Observe the failure due to the function-calling schema missing the `strict`/`additionalProperties` keys/values.

Code snippets

No response

OS

MacOS

Python version

Python 3.12.5

Library version

openai 1.47.0

RobertCraigie commented 4 weeks ago

Hi @moonbox3, the `beta.chat.completions.parse()` method currently requires a strict `response_format` and strict tools because the types assume that function tools can always be parsed. In general, the purpose of the `.parse()` method is to support auto-parsing, so we didn't think it made sense to support tools or response formats that we couldn't guarantee to be parseable.

In your docs, I don't see this coupling between function calling schema and json_schema response format called out (if it is there, I am obviously missing it).

The docs are right that there isn't any coupling between function calling and response formats, as the OpenAI API doesn't require any; it's just the `.parse()` SDK helper method that requires parseable inputs.

Does that help answer your question?

moonbox3 commented 4 weeks ago

Hi @RobertCraigie, thanks for your reply, I appreciate it. I'm really hoping to support only the `json_schema` response format for now, without having to manage strict tools. Given that this works in .NET, I would expect the same from the other OpenAI SDKs, such as Python.

Is there another way to handle a chat completion with a `json_schema` response format without using the `.parse()` method, which forces strict tools? Or is `.parse()` the only way?

Thanks for your help.

RobertCraigie commented 4 weeks ago

We'll be shipping a public API for this shortly, but for now your best bet would be to use our internal API for converting a type to a `response_format` and then parse the response yourself:

from typing import List

import rich
from pydantic import BaseModel

from openai import OpenAI
from openai.lib._parsing._completions import type_to_response_format_param


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: List[Step]
    final_answer: str


client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    # Convert the Pydantic model into a strict json_schema response_format param.
    response_format=type_to_response_format_param(MathResponse),
)

message = completion.choices[0].message
if message.content:
    # Validate the raw JSON content against the Pydantic model ourselves.
    parsed = MathResponse.model_validate_json(message.content)
    rich.print(parsed)
else:
    # The model can refuse to answer; surface the refusal instead.
    print(message.refusal)

This works because the `.create()` method doesn't impose any additional restrictions on the input; it just passes everything straight through to the API.

moonbox3 commented 4 weeks ago

@RobertCraigie thanks for your help! I will have a look at this.