Validation Error found in LangSmith

[x] This is actually a bug report.
[ ] I am not getting good LLM Results
[ ] I have tried asking for help in the community on discord or discussions and have not received a response.
[ ] I have tried searching the documentation and have not found an answer.

What Model are you using?

[ ] gpt-3.5-turbo
[ ] gpt-4-turbo
[ ] gpt-4
[x] Other (please specify): Azure OpenAI gpt-35-turbo-16k

Describe the bug

I'm defining a Pydantic class to structure the output of an LLM evaluation. When the class asks for integer scores only, it works perfectly. However, when it also asks for an explanation string alongside each integer, the call produces a validation error. I only see the error when looking at the traces in LangSmith.

To Reproduce

The following code works without errors:

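from pydantic import BaseModel, Field

# Assumed to be defined elsewhere: `client` (an instructor-patched Azure OpenAI
# client, since `response_model` is used), `deployment_name`, and `markdown_content`.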

class EvaluationTest(BaseModel):
    about_the_organization: int = Field(ge=1, le=5)
    project_overview: int = Field(ge=1, le=5)
    scope_of_work: int = Field(ge=1, le=5)
    deliverables: int = Field(ge=1, le=5)
    proposal_format: int = Field(ge=1, le=5)
    rfp_issuer_scoring_method: int = Field(ge=1, le=5)

evaluation_test = client.chat.completions.create(
    model=deployment_name,
    response_model=EvaluationTest,
    messages=[
        {
            "role": "system",
            "content": """
            Evaluate the quality of the content in each section of the business proposal outline. 
            For each section, provide a score from 1 to 5, where 5 is perfect.
            """,
        },
        {
            "role": "user", 
            "content": f"Here is the outline: {markdown_content}"
        },
    ]
)

But this code, which adds a rationale string for each score, produces the error:

from pydantic import BaseModel, Field

# Define a Pydantic model for the evaluation results
class EvaluationResult(BaseModel):
    about_the_organization: int = Field(ge=1, le=5)
    about_the_organization_rationale: str
    project_overview: int = Field(ge=1, le=5)
    project_overview_rationale: str
    scope_of_work: int = Field(ge=1, le=5)
    scope_of_work_rationale: str
    deliverables: int = Field(ge=1, le=5)
    deliverables_rationale: str
    proposal_format: int = Field(ge=1, le=5)
    proposal_format_rationale: str
    rfp_issuer_scoring_method: int = Field(ge=1, le=5)
    rfp_issuer_scoring_method_rationale: str

evaluation_results = client.chat.completions.create(
    model=deployment_name,
    response_model=EvaluationResult,
    messages=[
        {
            "role": "system",
            "content": """
            Evaluate the quality of the content in each section of the business proposal outline. 
            For each section, provide a score from 1 to 5, where 5 is perfect.
            Additionally, provide a rationale for the given score.
            """,
        },
        {
            "role": "user", 
            "content": f"Here is the outline: {markdown_content}"
        },
    ]
)
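
To help narrow down whether the model's output or the schema is at fault, the validation step can be exercised locally without calling the LLM. The following is only a minimal sketch, assuming Pydantic v2: `SectionCheck` is a hypothetical, trimmed-down stand-in for `EvaluationResult`, `validation_problems` is an illustrative helper, and the payloads are invented, not taken from the actual trace.

```python
from pydantic import BaseModel, Field, ValidationError


class SectionCheck(BaseModel):
    # Hypothetical stand-in for EvaluationResult: one score plus its rationale.
    about_the_organization: int = Field(ge=1, le=5)
    about_the_organization_rationale: str


def validation_problems(raw_json: str) -> list[str]:
    """Return one 'field: message' line per validation error, or [] if valid."""
    try:
        SectionCheck.model_validate_json(raw_json)
    except ValidationError as exc:
        return [f"{'.'.join(map(str, e['loc']))}: {e['msg']}" for e in exc.errors()]
    return []


# Invented payloads for illustration; the real model output may differ.
ok = '{"about_the_organization": 4, "about_the_organization_rationale": "Clear and specific."}'
missing = '{"about_the_organization": 4}'
out_of_range = '{"about_the_organization": 7, "about_the_organization_rationale": "Great."}'

for name, payload in [("ok", ok), ("missing", missing), ("out_of_range", out_of_range)]:
    print(name, validation_problems(payload))
```

If the raw model output copied from the LangSmith trace is passed through the same kind of check against the full `EvaluationResult` class, the reported field names should indicate which score or rationale failed validation.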

Expected behavior

I expected a structured output with an integer rating and a string explanation for each section, without errors.

Screenshots

Taken from LangSmith tracing.

jxnl / instructor

Validation Error found Langsmith #788