openai / openai-python

The official Python library for the OpenAI API
https://pypi.org/project/openai/
Apache License 2.0

beta.chat.completions.parse returns unhandled ValidationError #1763

Open marinomaria opened 1 month ago

marinomaria commented 1 month ago

Confirm this is an issue with the Python library and not an underlying OpenAI API

Describe the bug

On some occasions, while using the Chat Completions API with Structured Outputs, the SDK fails and raises a ValidationError:

ValidationError: 1 validation error for RawResponse
  Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value='                        ...                       ', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/json_invalid

This does not happen every time, but we use it in a production service and this unpredictable behavior is hard to prevent.

To Reproduce

  1. Create a Pydantic model
  2. Instantiate an OpenAI client
  3. Use the method OpenAI.beta.chat.completions.parse(...) with the following arguments
  4. Repeat a few times to see the error

from pydantic import BaseModel
from openai import OpenAI

class RawResponse(BaseModel):
    answer: str

client = OpenAI(api_key=...)
completion = client.beta.chat.completions.parse(
    model='gpt-4o-2024-08-06',
    messages=messages,
    max_tokens=750,
    n=1,
    stop=None,
    temperature=0.1,
    response_format=RawResponse,
)

After a few times, this fails with:

ValidationError: 1 validation error for RawResponse
  Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value='                        ...                       ', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/json_invalid
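
To illustrate the kind of failure reported above, here is a minimal sketch that uses the standard library's json module as a stand-in for pydantic's JSON parser (an assumption made so the example has no dependencies; the SDK itself goes through model_validate_json). The whitespace-only content string and the try_parse helper are hypothetical and only mimic the input_value shown in the error.

```python
import json

# Hypothetical stand-in for the message content seen in the error:
# nothing but whitespace, so the parser reaches end of input without a value.
content = " " * 599


def try_parse(raw: str):
    """Return (parsed, error); error is None when raw is valid JSON."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        return None, exc


# Whitespace-only input: parsing fails because no JSON value is ever found.
parsed, error = try_parse(content)
print(error)

# A well-formed payload for the RawResponse shape parses without error.
ok, ok_error = try_parse('{"answer": "42"}')
print(ok)
```

A guard like this, applied to the raw content before validation, would let a caller distinguish an empty or whitespace-only reply from a schema mismatch.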

Code snippets

No response

OS

debian:bullseye-slim

Python version

CPython 3.10.8

Library version

openai 1.48.0

RobertCraigie commented 1 month ago

Thanks for the report. It looks like your example script isn't complete; could you share a full script?

marinomaria commented 1 month ago

Hi, thanks for the quick reply! Sadly, I can't provide a full script for privacy reasons, but I'll be happy to share any information you need to identify the issue. Here's the traceback:

File "/app/src/core/modules/emiGPT/core/openai_chat_api.py", line 95, in _get_response
  completion = self._client.beta.chat.completions.parse(
File "/opt/venv/lib/python3.10/site-packages/openai/resources/beta/chat/completions.py", line 145, in parse
  return _parse_chat_completion(
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 110, in parse_chat_completion
  "parsed": maybe_parse_content(
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 161, in maybe_parse_content
  return _parse_content(response_format, message.content)
File "/opt/venv/lib/python3.10/site-packages/openai/lib/_parsing/_completions.py", line 221, in _parse_content
  return cast(ResponseFormatT, model_parse_json(response_format, content))
File "/opt/venv/lib/python3.10/site-packages/openai/_compat.py", line 166, in model_parse_json
  return model.model_validate_json(data)
File "/opt/venv/lib/python3.10/site-packages/pydantic/main.py", line 625, in model_validate_json
  return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)
pydantic_core._pydantic_core.ValidationError: 1 validation error for RawResponse
  Invalid JSON: EOF while parsing a value at line 1 column 600 [type=json_invalid, input_value='                        ...                       ', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/json_invalid

Please let me know if there's anything else you need.

RobertCraigie commented 1 month ago

Could you share a request ID from a failing request? https://github.com/openai/openai-python#request-ids

marinomaria commented 1 month ago

From our application logs, I understand that the call to client.beta.chat.completions.parse(...) raised an exception, so it returned no result from which to extract a request_id. 😞

RobertCraigie commented 1 month ago

ahhhh right, sorry. If you don't already have debug logging enabled, could you enable it? https://github.com/openai/openai-python#logging That should show a request ID in the logs.
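
For anyone following along, enabling debug logging might look roughly like the sketch below. It assumes the SDK reads the OPENAI_LOG environment variable at import time and logs under the "openai" logger name, as the linked README section describes; the log format string is an arbitrary choice for illustration.

```python
import logging
import os

# Assumption: the SDK checks OPENAI_LOG when it is imported,
# so set it before `import openai` runs anywhere in the process.
os.environ["OPENAI_LOG"] = "debug"

# Send log records to stderr with timestamps so a failing request
# can be matched against application logs.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumption: the SDK's records are emitted under the "openai" logger.
openai_logger = logging.getLogger("openai")
openai_logger.setLevel(logging.DEBUG)
```

With this in place before the client is created, the debug output for each request should include the response headers, which is where a request ID would appear.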

marinomaria commented 1 month ago

Sure thing!

jonomillin commented 4 days ago

Hi there! Any updates here? FYI - same thing happening for me, probably 50% of the time:

openai==1.53.0
pydantic==2.9.2
pydantic_core==2.23.4

I've worked around this in the meantime with a tenacity retry, but it adds latency and extra calls, which isn't ideal...

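import logging

from pydantic import ValidationError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_none

logger = logging.getLogger(__name__)

# async_client, model, messages, MyPydanticModel and FUNCTION_LIST are defined elsewhere.

@retry(
    stop=stop_after_attempt(3),
    wait=wait_none(),  # No wait between retries
    retry=retry_if_exception_type(ValidationError),
    reraise=True,  # Re-raise the last ValidationError instead of tenacity.RetryError
    # before_sleep runs only after a failed attempt, so the hint is not sent on the first call
    before_sleep=lambda retry_state: messages.append({
        "role": "system",
        "content": "Raised Exception: pydantic_core._pydantic_core.ValidationError. Please try again and conform to model specs."
    }),
)
async def attempt_parse():
    return await async_client.beta.chat.completions.parse(
        model=model,
        messages=messages,
        response_format=MyPydanticModel,
        functions=FUNCTION_LIST,
        function_call="auto",
    )

# Run inside an async function
try:
    response = await attempt_parse()
except ValidationError as e:
    logger.error(f"Failed to parse response after all retries: {e}")
    raise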
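
The same retry pattern can be sketched without tenacity, which may help when the extra dependency is unwanted. This is only an illustration: retry_async, flaky, calls and notes are hypothetical names, and ValueError stands in for the validation error so the example runs without pydantic or the OpenAI client.

```python
import asyncio


async def retry_async(call, retry_on, attempts=3, on_retry=None):
    """Await call() up to `attempts` times, retrying only on `retry_on`."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on as exc:
            if attempt == attempts:
                raise  # Out of attempts: surface the original exception
            if on_retry is not None:
                on_retry(exc)  # e.g. append a corrective message


# Demo with a hypothetical flaky coroutine that fails twice, then succeeds.
calls = {"count": 0}
notes = []


async def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("invalid JSON")
    return {"answer": "ok"}


result = asyncio.run(
    retry_async(flaky, ValueError, on_retry=lambda exc: notes.append(str(exc)))
)
print(result, calls["count"], notes)
```

As with the tenacity version, every retry is an extra API call, so this trades latency and cost for resilience rather than addressing the underlying cause.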