
ChatMistralAI with_structured_output does not recognize BaseModel subclass #22390

Open ruze00 opened 3 months ago

ruze00 commented 3 months ago


Example Code

from pydantic import BaseModel, Field
from langchain_mistralai import ChatMistralAI


class Code(BaseModel):
    prefix: str = Field(description="Description of the problem and approach")
    imports: str = Field(description="Code block import statements")
    code: str = Field(description="Code block not including import statements")

messages = state["messages"]
...

llm = ChatMistralAI(model="codestral-latest", temperature=0, endpoint="https://codestral.mistral.ai/v1")
code_gen_chain = llm.with_structured_output(Code, include_raw=False)
code_solution = code_gen_chain.invoke(messages)

code_solution is always a dict, not an instance of Code.
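
For reference, a quick type check on the result (variable names follow the snippet above) makes the symptom concrete:

print(type(code_solution))              # <class 'dict'>
print(isinstance(code_solution, Code))  # False, even though Code was the requested schema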

Error Message and Stack Trace (if applicable)

No response

Description

The snippet in Example Code above comes from a recent public Codestral demo; as noted, code_solution always comes back as a dict rather than a Code instance.

Stepping into llm.with_structured_output, the first lines are:

if kwargs:
    raise ValueError(f"Received unsupported arguments {kwargs}")
is_pydantic_schema = isinstance(schema, type) and issubclass(schema, BaseModel)

issubclass(schema, BaseModel) always returns False, even though schema is the same Code type that was passed in.

Before the call:

>>> Code
<class 'codestral.model.Code'>
>>> issubclass(Code, BaseModel)
True
>>> type(Code)
<class 'pydantic._internal._model_construction.ModelMetaclass'>

Step inside the call:

>>> schema
<class 'codestral.model.Code'>
>>> issubclass(schema, BaseModel)
False
>>> type(schema)
<class 'pydantic._internal._model_construction.ModelMetaclass'>

The check behaves correctly outside the call into LangChain and incorrectly inside it.
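
A minimal sketch of what could cause the flip, assuming the BaseModel referenced inside langchain-mistralai is the Pydantic V1 class rather than the V2 one (see the follow-up comment below):

from pydantic import BaseModel as BaseModelV2, Field
from pydantic.v1 import BaseModel as BaseModelV1

class Code(BaseModelV2):
    prefix: str = Field(description="Description of the problem and approach")

# Outside the library, the check runs against the V2 BaseModel:
print(issubclass(Code, BaseModelV2))  # True

# If the name BaseModel resolves to the V1 class inside the library,
# the same schema fails the identical-looking check:
print(issubclass(Code, BaseModelV1))  # False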

System Info

langchain==0.2.1
langchain-community==0.2.1
langchain-core==0.2.3
langchain-mistralai==0.1.7
langchain-text-splitters==0.2.0
pydantic==2.7.2
pydantic_core==2.18.3

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.5.0: Wed May 1 20:14:38 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6020
Python Version: 3.11.9 (main, Apr 19 2024, 11:43:47) [Clang 14.0.6 ]

Package Information

langchain_core: 0.2.3
langchain: 0.2.1
langchain_community: 0.2.1
langsmith: 0.1.67
langchain_mistralai: 0.1.7
langchain_text_splitters: 0.2.0
langgraph: 0.0.60

ruze00 commented 3 months ago

On further sleuthing, I discovered that ChatMistralAI expects Pydantic V1 models, while langchain installs Pydantic V2 by default. This should either be made clear in the documentation or fixed in the code.
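
A possible workaround under that assumption (a sketch, not a confirmed fix): build the schema from langchain_core.pydantic_v1, which re-exports the V1 BaseModel and Field, so the issubclass check inside with_structured_output should match.

# Workaround sketch, assuming ChatMistralAI compares schemas against Pydantic V1:
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_mistralai import ChatMistralAI

class Code(BaseModel):
    prefix: str = Field(description="Description of the problem and approach")
    imports: str = Field(description="Code block import statements")
    code: str = Field(description="Code block not including import statements")

llm = ChatMistralAI(model="codestral-latest", temperature=0, endpoint="https://codestral.mistral.ai/v1")
code_gen_chain = llm.with_structured_output(Code, include_raw=False)
# code_gen_chain.invoke(messages) should now return a Code instance rather than a dict.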