Pydantic model validator of HumanMessage causing AttributeError when using mixed list of messages in Pydantic model

Chengdyc commented 1 week ago

Checked other resources

[X] I added a very descriptive title to this issue.
[X] I searched the LangChain documentation with the integrated search.
[X] I used the GitHub search to find a similar question and didn't find it.
[X] I am sure that this is a bug in LangChain rather than my code.
[X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from pydantic import BaseModel, Field
import uuid
from typing import List

if __name__ == "__main__":
    human_msg = HumanMessage(content="i'm human")
    ai_msg = AIMessage(content="i'm AI")
    tool_msg = ToolMessage(content="i'm tool", tool_call_id="123")

    class A(BaseModel):
        hist: List[HumanMessage] = Field(default_factory=list)

    class B(BaseModel):
        hist: List[HumanMessage|ToolMessage] = Field(default_factory=list)

    class C(BaseModel):
        hist: List[HumanMessage|AIMessage] = Field(default_factory=list)

    a = A(hist=[human_msg])
    b1 = B(hist=[human_msg])
    b2 = B(hist=[tool_msg])
    b3 = B(hist=[human_msg, tool_msg])
    c1 = C(hist=[ai_msg])
    c2 = C(hist=[human_msg])
    c3 = C(hist=[ai_msg, human_msg])

An exception is raised when initializing 'c2' and 'c3'.

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/home/xxx/workspace/yyy/app/repro_bug.py", line 28, in <module>
    c2 = C(hist=[human_msg])
         ^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/workspace/yyy/venv/lib/python3.11/site-packages/pydantic/main.py", line 212, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/workspace/yyy/venv/lib/python3.11/site-packages/langchain_core/messages/ai.py", line 117, in _backwards_compat_tool_calls
    check_additional_kwargs = not any(
                                  ^^^^
  File "/home/xxx/workspace/yyy/venv/lib/python3.11/site-packages/langchain_core/messages/ai.py", line 118, in <genexpr>
    values.get(k)
    ^^^^^^^^^^
  File "/home/xxx/workspace/yyy/venv/lib/python3.11/site-packages/pydantic/main.py", line 856, in __getattr__
    raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')
AttributeError: 'HumanMessage' object has no attribute 'get'

Description

I have a Pydantic BaseModel that contains a list of messages from chat history; the field type is a list of AIMessage or HumanMessage. After upgrading to LangChain 0.3.1, I started getting errors when creating the object with HumanMessage instances in the list.

The problem is with the 'values.get()' call in AIMessage's model validator https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/messages/ai.py#L118

The model validator is invoked on HumanMessage instances when they are part of the list.

The same code worked fine in LangChain 0.2, before upgrading to 0.3.

System Info

System Information
------------------
> OS:  Linux
> OS Version:  #1 SMP Fri Mar 29 23:14:13 UTC 2024
> Python Version:  3.11.9 (main, Apr  6 2024, 17:59:24) [GCC 11.4.0]

Package Information
-------------------
> langchain_core: 0.3.6
> langchain: 0.3.1
> langchain_community: 0.3.1
> langsmith: 0.1.125
> langchain_astradb: 0.5.0
> langchain_google_genai: 1.0.10
> langchain_openai: 0.2.1
> langchain_text_splitters: 0.3.0
> langgraph: Installed. No version info available.
> langserve: 0.3.0

Other Dependencies
------------------
> aiohttp: 3.10.5
> astrapy: 1.5.0
> async-timeout: Installed. No version info available.
> dataclasses-json: 0.5.9
> fastapi: 0.112.1
> google-generativeai: 0.7.2
> httpx: 0.27.0
> jsonpatch: 1.33
> numpy: 1.26.4
> openai: 1.50.2
> orjson: 3.10.5
> packaging: 23.2
> pillow: 10.4.0
> pydantic: 2.9.2
> pydantic-settings: 2.5.2
> PyYAML: 6.0.1
> requests: 2.32.3
> SQLAlchemy: 2.0.31
> sse-starlette: 1.8.2
> tenacity: 8.4.1
> tiktoken: 0.7.0
> typing-extensions: 4.12.2

keenborder786 commented 6 days ago

Can you please upgrade your langchain version? I just tried and it seems to be working on the latest version.

Chengdyc commented 6 days ago

Could you specify which version? I'm using langchain 0.3.1, which is the latest: https://pypi.org/project/langchain/

Jakolo121 commented 4 days ago

The latest langchain release on PyPI is indeed 0.3.1, but the latest langchain-core release is 0.3.7.

Chengdyc commented 4 days ago

Thanks, just upgraded to langchain-core 0.3.7 but the problem persists.

$ poetry show langchain-core
 name         : langchain-core
 version      : 0.3.7
 description  : Building applications with LLMs through composability
$ poetry show langchain
 name         : langchain
 version      : 0.3.1
 description  : Building applications with LLMs through composability

If I understand correctly, a type check is needed in the model_validator for AIMessage to narrow the type as suggested by Pydantic documentation.
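A minimal sketch of the type narrowing described above, assuming Pydantic v2. The `Human`, `AI`, and `History` classes and the `_backfill_tool_calls` validator are hypothetical stand-ins, not LangChain's actual classes or its actual fix. The point is only that a `mode="before"` validator can receive something other than a dict (for example, another model instance while a union is being resolved), so it should check before calling dict methods.

```python
from typing import Any, List, Union

from pydantic import BaseModel, Field, model_validator


class Human(BaseModel):
    """Hypothetical stand-in for HumanMessage."""

    content: str


class AI(BaseModel):
    """Hypothetical stand-in for AIMessage."""

    content: str
    tool_calls: list = Field(default_factory=list)

    @model_validator(mode="before")
    @classmethod
    def _backfill_tool_calls(cls, values: Any) -> Any:
        # A "before" validator sees the raw input, which is not always a
        # dict. Pass anything else through and let normal validation
        # accept or reject it.
        if not isinstance(values, dict):
            return values
        values = dict(values)  # copy so the caller's dict is not mutated
        values.setdefault("tool_calls", [])
        return values


class History(BaseModel):
    hist: List[Union[Human, AI]] = Field(default_factory=list)


history = History(hist=[Human(content="i'm human"), AI(content="i'm AI")])
print([type(m).__name__ for m in history.hist])
```

Without the `isinstance` guard, a call such as `values.get(...)` on a `Human` instance would raise an `AttributeError` like the one in the traceback, rather than a validation error that the union could recover from.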

Jakolo121 commented 2 days ago

Have you tried turning the kernel off and on again? That worked for me! I also updated langchain_core after I tested your code with Pydantic V1 and V2.

Chengdyc commented 2 days ago

Thanks, I'm not running from a notebook environment, just running from console inside a Python virtual env.

source venv/bin/activate
poetry run python3 app/repro_langchain_bug.py

I updated langchain-core to 0.3.8 and I can still repro the bug.

AttributeError: 'HumanMessage' object has no attribute 'get'

Jakolo121 commented 1 day ago

Did you try running your code in a new environment? I tested your code in a new Python venv and in a new Poetry shell. It runs without an error. I only added print statements at the end.

Poetry version: 1.8.3
Python: 3.12.7

name = "langchain-core" version = "0.3.8"

name = "pydantic" version = "2.9.2"

Command: poetry run python bug.py

Output:

a: hist=[HumanMessage(content="i'm human", additional_kwargs={}, response_metadata={})]

b1: hist=[HumanMessage(content="i'm human", additional_kwargs={}, response_metadata={})]

b2: hist=[ToolMessage(content="i'm tool", tool_call_id='123')]

b3: hist=[HumanMessage(content="i'm human", additional_kwargs={}, response_metadata={}), ToolMessage(content="i'm tool", tool_call_id='123')]

c1: hist=[AIMessage(content="i'm AI", additional_kwargs={}, response_metadata={})]

c2: hist=[HumanMessage(content="i'm human", additional_kwargs={}, response_metadata={})]

c3: hist=[AIMessage(content="i'm AI", additional_kwargs={}, response_metadata={}), HumanMessage(content="i'm human", additional_kwargs={}, response_metadata={})]

Chengdyc commented 1 day ago

Thanks, it must be something with my Poetry / Python environment. I ended up reinstalling Python 3.11 and Poetry. Now the code works and no longer fails.
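Since the cause turned out to be the environment, one way to check for this kind of mismatch is to print the package versions seen by the interpreter that actually runs the script, rather than relying only on `poetry show`. This is a minimal sketch using only the standard library; the `installed_versions` helper is a hypothetical name, and the package list is just the packages discussed in this thread.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the running interpreter."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in the environment this interpreter uses.
            found[name] = None
    return found


if __name__ == "__main__":
    # Shows which Python binary is running, e.g. the venv's or the system's.
    print("interpreter:", sys.executable)
    versions = installed_versions(["langchain-core", "langchain", "pydantic"])
    for name, version in versions.items():
        print(f"{name}: {version or 'not installed'}")
```

Running it with the same command as the failing script (for example, `poetry run python3 ...`) shows whether the interpreter and versions match what the package manager reports.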
