beta.chat.completions.parse returns unhandled ValidationError

marinomaria commented 1 month ago

Confirm this is an issue with the Python library and not an underlying OpenAI API

[X] This is an issue with the Python library

Describe the bug

On some occasions, when using the Chat Completions API with Structured Outputs, the SDK fails and raises a ValidationError:

ValidationError: 1 validation error for RawResponse
  Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value='                        ...                       ', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/json_invalid

This does not happen every time, but we use it in a production service and this unpredictable behavior is hard to prevent.
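
The error text itself can be reproduced offline, which may help when reading it. The sketch below assumes, based on the whitespace-only input_value shown in the message, that the content handed to the validator was blank. It validates such a string against the same RawResponse model. The 600-character length is illustrative only, and this is not a confirmed diagnosis of the failing requests.

```python
from pydantic import BaseModel, ValidationError


class RawResponse(BaseModel):
    answer: str


# Assumption: the model returned only whitespace, as the truncated
# input_value in the error message suggests. The length is illustrative.
blank_content = " " * 600

try:
    RawResponse.model_validate_json(blank_content)
except ValidationError as exc:
    error = exc.errors()[0]
    # Pydantic reports malformed JSON with the "json_invalid" error type.
    print(error["type"])
    print(error["msg"])
```

If the printed message matches the one from production, logging the raw message content before validation would be one way to confirm what the model actually returned.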

To Reproduce

Create a Pydantic model
Instantiate an OpenAI client
Use the method OpenAI.beta.chat.completions.parse(...) with the following arguments
Repeat a few times until the error appears

from pydantic import BaseModel
from openai import OpenAI

class RawResponse(BaseModel):
    answer: str

client = OpenAI(api_key=...)
# `messages` is built elsewhere in the application and is not included in this report.
completion = client.beta.chat.completions.parse(
    model='gpt-4o-2024-08-06',
    messages=messages,
    max_tokens=750,
    n=1,
    stop=None,
    temperature=0.1,
    response_format=RawResponse,
)

After a few times, this fails with:

ValidationError: 1 validation error for RawResponse
  Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value='                        ...                       ', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/json_invalid

Code snippets

No response

OS

debian:bullseye-slim

Python version

CPython 3.10.8

Library version

openai 1.48.0

RobertCraigie commented 1 month ago

Thanks for the report. It looks like your example script isn't complete; could you share a full script?

marinomaria commented 1 month ago

Hi, thanks for the quick reply! Sadly I can't provide a full script for privacy reasons but I'll be happy to share any information you need for identifying the issue. Here's the traceback:

File "/app/src/core/modules/emiGPT/core/openai_chat_api.py", line 95, in _get_response  
  completion = self._client.beta.chat.completions.parse(    
File "/opt/venv/lib/python3.10/site-packages/openai/resources/beta/chat/completions.py", line 145, in parse 
  return _parse_chat_completion(    
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 110, in parse_chat_completion   
  "parsed": maybe_parse_content(    
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 161, in maybe_parse_content 
  return _parse_content(response_format, message.content)   
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 221, in _parse_content  
  return cast(ResponseFormatT, model_parse_json(response_format, content))  
File "/opt/venv/lib/python3.10/site-packages/openai/_compat.py", line 166, in model_parse_json  
  return model.model_validate_json(data)    
File "/opt/venv/lib/python3.10/site-packages/pydantic/main.py", line 625, in model_validate_json    
  return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)    
pydantic_core._pydantic_core.ValidationError: 1 validation error for RawResponse    
  Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value='                        ...                       ', input_type=str]    
    For further information visit https://errors.pydantic.dev/2.9/v/json_invalid

Please let me know if there's anything else you need.

RobertCraigie commented 1 month ago

Could you share a request ID from a failing request? https://github.com/openai/openai-python#request-ids

marinomaria commented 1 month ago

Judging from our application logs, the call to client.beta.chat.completions.parse(...) raised an exception, so there was no response object from which to extract a request_id. 😞

RobertCraigie commented 1 month ago

Ahh, right, sorry. If you don't already have debug logging enabled, could you enable it? https://github.com/openai/openai-python#logging That should show a request ID in the logs.
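
For reference, a minimal sketch of turning on debug logging from Python, following the linked README section. It assumes the SDK reads the OPENAI_LOG environment variable at import time and logs through a standard-library logger named "openai"; both are worth checking against the installed version.

```python
import logging
import os

# Assumption: the SDK reads OPENAI_LOG when it is imported, so set it first.
os.environ.setdefault("OPENAI_LOG", "debug")

# Send log records to stderr with timestamps so a failing call can be
# matched to the request that preceded it.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Raise verbosity only for the SDK's logger ("openai" is the assumed name).
sdk_logger = logging.getLogger("openai")
sdk_logger.setLevel(logging.DEBUG)
```

Setting OPENAI_LOG=debug in the service's environment, without any code change, should have the same effect.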

marinomaria commented 1 month ago

Sure thing!

jonomillin commented 4 days ago

Hi there! Any updates here? FYI - same thing happening for me, probably 50% of the time:

openai==1.53.0
pydantic==2.9.2
pydantic_core==2.23.4

I've solved this in the meantime with a tenacity retry, but it's adding latency and calls which isn't ideal...

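from pydantic import ValidationError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_none

# Runs inside an async function; async_client, model, messages,
# MyPydanticModel, FUNCTION_LIST and logger are defined elsewhere.
try:
    @retry(
        stop=stop_after_attempt(3),
        wait=wait_none(),  # No wait between retries
        retry=retry_if_exception_type(ValidationError),
        # before_sleep runs only after a failed attempt, so the hint is not sent with the first call
        before_sleep=lambda retry_state: messages.append({
            "role": "system",
            "content": "Raised Exception: pydantic_core._pydantic_core.ValidationError. Please try again and conform to model specs."
        }),
        reraise=True,  # Re-raise the ValidationError itself rather than tenacity.RetryError
    )
    async def attempt_parse():
        return await async_client.beta.chat.completions.parse(
            model=model,
            messages=messages,
            response_format=MyPydanticModel,
            functions=FUNCTION_LIST,
            function_call="auto",
        )

    response = await attempt_parse()

except ValidationError as e:
    logger.error(f"Failed to parse response after all retries: {e}")
    raise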
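
The same retry idea can be sketched without tenacity. The helper below is hypothetical: retry_async and its parameters are not part of the SDK or of tenacity. It awaits a coroutine a fixed number of times, retries only on the given exception types, and re-raises the last error unchanged so that a surrounding except clause for that type still catches it. A stand-in coroutine replaces the real API call so the example runs offline.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_async(
    call: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
) -> T:
    """Await call() up to `attempts` times, retrying only on `retry_on`."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the original exception.
    raise ValueError("attempts must be at least 1")


# Mutable counter so the stand-in can track how often it was called.
state = {"calls": 0}


async def flaky_parse() -> str:
    """Stand-in for the API call: fails twice, then succeeds."""
    state["calls"] += 1
    if state["calls"] < 3:
        raise ValueError("simulated invalid JSON")
    return "ok"


result = asyncio.run(retry_async(flaky_parse, retry_on=(ValueError,)))
print(result, state["calls"])
```

With the real client, the stand-in would be replaced by a coroutine wrapping the parse call, and retry_on would be (ValidationError,) imported from pydantic. Each retry still costs an extra request, so this does not remove the added latency.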

openai / openai-python