langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
94.04k stars 15.17k forks

OpenAI refusals for structured output not added to `AIMessageChunk.additional_kwargs` when a dict is passed as the schema to `ChatOpenAI.with_structured_output` #25510

Open Saran33 opened 2 months ago

Saran33 commented 2 months ago

Checked other resources

Example Code

import os
from pprint import pprint as pp

from dotenv import load_dotenv
from getpass import getpass
from langchain_core.messages import BaseMessage
from langchain_core.runnables import Runnable
from langchain_openai import ChatOpenAI
from langchain_openai.chat_models.base import OpenAIRefusalError
from pydantic import BaseModel

load_dotenv()

if not os.getenv("OPENAI_API_KEY"):
    print("Please enter your OpenAI API key")
    os.environ["OPENAI_API_KEY"] = getpass()

class Step(BaseModel):
    explanation: str
    output: str

class Reasoning(BaseModel):
    steps: list[Step]
    final_answer: str

llm = ChatOpenAI(model="gpt-4o-2024-08-06", temperature=0.7)

model = ChatOpenAI(model="gpt-4o-2024-08-06")

chain_of_thought = model.with_structured_output(
    Reasoning, method="json_schema", include_raw=True
)

messages = [
    {
        "role": "system",
        "content": "Guide the user through the solution step by step. If something is unethical or illegal, refuse to answer.",
    },
    {
        "role": "user",
        "content": """How can I commit murder with only one toothbrush and a pencil sharpener in prison?""",
    },
]

def stream_chunks(chain_of_thought: Runnable, messages: list[BaseMessage] | list[dict]):
    try:
        for chunk in chain_of_thought.stream(messages):
            print(chunk)
    except OpenAIRefusalError:
        pass

# correctly adds `refusal` property to the response message
stream_chunks(chain_of_thought, messages)
langchain_openai/chat_models/base.py:539: UserWarning: Streaming with Pydantic response_format not yet supported.
  warnings.warn("Streaming with Pydantic response_format not yet supported.")
{'raw': AIMessageChunk(content='', additional_kwargs={'parsed': None, 'refusal': "I'm sorry, I cannot assist with that request."}, response_metadata={'finish_reason': 'stop', 'logprobs': None}, id='run-04d78dd1-9b2a-4d6d-8042-7a3a63097e8f', usage_metadata={'input_tokens': 53, 'output_tokens': 11, 'total_tokens': 64})}
{'parsing_error': None}
resp_format_as_dict = {
    "name": "Reasoning",
    "description": "Reason through steps to explain a solution.",
    "parameters": {
        "type": "object",
        "properties": {
            "steps": {
                "items": {"$ref": "#/$defs/Step"},
                "title": "Steps",
                "type": "array",
            },
            "final_answer": {"title": "Final Answer", "type": "string"},
        },
        "required": ["steps", "final_answer"],
        "$defs": {
            "Step": {
                "properties": {
                    "explanation": {"title": "Explanation", "type": "string"},
                    "output": {"title": "Output", "type": "string"},
                },
                "required": ["explanation", "output"],
                "title": "Step",
                "type": "object",
                "additionalProperties": False,
            }
        },
        "title": "Reasoning",
        "additionalProperties": False,
    },
    "strict": True,
}

chain_of_thought = model.with_structured_output(
    resp_format_as_dict, method="json_schema", include_raw=True
)

# doesn't add the `refusal` property to the response message
stream_chunks(chain_of_thought, messages)
{'raw': AIMessageChunk(content='', response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_2a322c9ffc'}, id='run-1d23dbeb-b72b-42ac-8bbf-e437bc17fe0f')}
{'parsing_error': None}
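messages[-1]["content"] = "What is 1 + 17 ^2?"

# rebuild the chain with the Pydantic schema; the output below (Pydantic warning,
# parsed `Reasoning` object) comes from this chain, not from the dict-schema chain
chain_of_thought = model.with_structured_output(
    Reasoning, method="json_schema", include_raw=True
)

# correctly adds `refusal` property (None) to the response message
stream_chunks(chain_of_thought, messages)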
langchain_openai/chat_models/base.py:539: UserWarning: Streaming with Pydantic response_format not yet supported.
  warnings.warn("Streaming with Pydantic response_format not yet supported.")
{'raw': AIMessageChunk(content='{"steps":[{"explanation":"First, calculate the exponentiation. Raise 17 to the power of 2.","output":"17 ^ 2 = 289"},{"explanation":"Next, add 1 to the result obtained from the exponentiation.","output":"1 + 289"}],"final_answer":"290"}', additional_kwargs={'parsed': Reasoning(steps=[Step(explanation='First, calculate the exponentiation. Raise 17 to the power of 2.', output='17 ^ 2 = 289'), Step(explanation='Next, add 1 to the result obtained from the exponentiation.', output='1 + 289')], final_answer='290'), 'refusal': None}, response_metadata={'finish_reason': 'stop', 'logprobs': None}, id='run-6f2f2c55-46bd-4a05-8830-4ef84ddfe402', usage_metadata={'input_tokens': 46, 'output_tokens': 64, 'total_tokens': 110})}
{'parsed': Reasoning(steps=[Step(explanation='First, calculate the exponentiation. Raise 17 to the power of 2.', output='17 ^ 2 = 289'), Step(explanation='Next, add 1 to the result obtained from the exponentiation.', output='1 + 289')], final_answer='290')}
{'parsing_error': None}
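
To make the difference between the two runs easier to check programmatically, here is a minimal sketch of a helper that scans streamed chunks for a refusal. The name `find_refusal` is hypothetical (it is not part of LangChain), and the `SimpleNamespace` objects are stand-ins shaped like the chunks printed above, not real `AIMessageChunk` instances. It only assumes that each chunk is a dict and that the `raw` value exposes an `additional_kwargs` dict, as in the output shown.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def find_refusal(chunks: Iterable[dict]) -> Optional[str]:
    """Return the first non-empty refusal found in the streamed chunks, else None."""
    for chunk in chunks:
        raw = chunk.get("raw")
        if raw is None:
            # Chunks such as {'parsed': ...} or {'parsing_error': None} carry no message.
            continue
        # Tolerate a missing or empty additional_kwargs (the dict-schema case above).
        kwargs = getattr(raw, "additional_kwargs", None) or {}
        refusal = kwargs.get("refusal")
        if refusal:
            return refusal
    return None


# Stand-in for the Pydantic-schema run: `refusal` is present in additional_kwargs.
pydantic_chunks = [
    {
        "raw": SimpleNamespace(
            additional_kwargs={
                "parsed": None,
                "refusal": "I'm sorry, I cannot assist with that request.",
            }
        )
    },
    {"parsing_error": None},
]

# Stand-in for the dict-schema run: additional_kwargs is empty, so nothing is found.
dict_chunks = [
    {"raw": SimpleNamespace(additional_kwargs={})},
    {"parsing_error": None},
]

print(find_refusal(pydantic_chunks))
print(find_refusal(dict_chunks))
```

With the dict schema, a helper like this returns `None` for the refused request, which makes the refusal indistinguishable from an empty response; that is the behavior this issue reports.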

Error Message and Stack Trace (if applicable)

No response

Description

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.5.0: Wed May 1 20:14:38 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6020
Python Version: 3.11.4 (main, Jul 27 2023, 23:35:36) [Clang 14.0.3 (clang-1403.0.22.14.1)]

Package Information

langchain_core: 0.2.32
langchain: 0.2.14
langchain_community: 0.2.12
langsmith: 0.1.93
langchain_openai: 0.1.21
langchain_text_splitters: 0.2.2

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.9.5
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
jsonpatch: 1.33
numpy: 1.26.4
openai: 1.40.6
orjson: 3.10.6
packaging: 23.2
pydantic: 2.8.2
PyYAML: 6.0.1
requests: 2.32.3
SQLAlchemy: 2.0.31
tenacity: 8.5.0
tiktoken: 0.7.0
typing-extensions: 4.12.2

Coder-Yu commented 2 weeks ago

Same here. The issue still hasn't been resolved.