langchain-ai / langchain-google


ChatAnthropicVertex + AgentExecutor => not consistent output with other models #368

Open nick-youngblut opened 4 months ago

nick-youngblut commented 4 months ago

I realize that I should have posted my issue in this repo instead of the main langchain repo.

Essentially, running ChatAnthropicVertex (i.e., the Claude models) with langchain's AgentExecutor generates differently formatted output than the other langchain Chat* classes (e.g., ChatVertexAI, ChatGroq, or ChatOpenAI). This leads to downstream errors when the ChatAnthropicVertex output is fed into other functions that expect the format the other chat models produce.

Due to this formatting difference, I effectively cannot use ChatAnthropicVertex for any langchain applications.
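For concreteness, the difference looks roughly like this (a hypothetical sketch of the two shapes; concrete outputs are shown in the comment below):

# Rough shape of AgentExecutor output with ChatVertexAI / ChatGroq / ChatOpenAI:
# "output" is a plain string.
gemini_style = {"input": "hi", "output": "Hello! How can I help you today?"}

# Rough shape with ChatAnthropicVertex: "output" is a list of content blocks.
claude_style = {
    "input": "hi",
    "output": [{"text": "Hello! How can I help you today?", "type": "text", "index": 0}],
}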

I'm using langchain-google-vertexai==1.0.6. My whole langchain dep list:

langchain==0.2.7
langchain-community==0.2.7
langchain-core==0.2.12
langchain-google-vertexai==1.0.6
langchain-groq==0.1.6
langchain-openai==0.1.14
langchain-text-splitters==0.2.2
langchain-weaviate==0.0.2
langchainhub==0.1.20
felipe-notilus commented 1 month ago

I am encountering the same problem and thought I would detail it here instead of opening a new thread. I can create another if needed.

In particular, what I find is that when a Claude model (I am using Sonnet 3.5, claude-3-5-sonnet@20240620) is wrapped in the AgentExecutor class, the output causes problems when handling the conversation memory.

I initialize the models like this:

from langchain_google_vertexai import ChatVertexAI
from langchain_google_vertexai.model_garden import ChatAnthropicVertex

# Important: plain VertexAI does not have a bind_tools method, so ChatVertexAI is required
model_pro = ChatVertexAI(model_name="gemini-1.5-pro-002", temperature=0.0, location="europe-west1")

model_claude = ChatAnthropicVertex(model_name="claude-3-5-sonnet@20240620", project=project, location="europe-west1")

At first sight everything is OK: the outputs are consistent, and both are instances of AIMessage:

model_pro.invoke(input="hi there")

output:

AIMessage(content='Hi there! How can I help you today?\n', response_metadata={'is_blocked': False, 'safety_ratings': [], 'usage_metadata': {'prompt_token_count': 2, 'candidates_token_count': 11, 'total_token_count': 13}, 'finish_reason': 'STOP'}, id='run-52b77124-fd7a-42aa-a088-1d9a3ec25aee-0', usage_metadata={'input_tokens': 2, 'output_tokens': 11, 'total_tokens': 13})

model_claude.invoke(input="hi there")

output:

AIMessage(content='Hello! How can I assist you today? Feel free to ask me any questions or let me know if you need help with anything.', response_metadata={'id': 'msg_vrtx_01JFjAYNUVxPiG6zPxC6f5dm', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 9, 'output_tokens': 30}}, id='run-a4d4090d-0013-49b0-91ee-22d612453575-0', usage_metadata={'input_tokens': 9, 'output_tokens': 30, 'total_tokens': 39})
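A quick sanity check (a sketch reusing the models defined above) confirms that both return the same message type:

from langchain_core.messages import AIMessage

assert isinstance(model_pro.invoke(input="hi there"), AIMessage)
assert isinstance(model_claude.invoke(input="hi there"), AIMessage)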

However, when wrapping them in the AgentExecutor class, there is a difference in the outputs:

from langchain.agents import create_tool_calling_agent
from langchain.agents import AgentExecutor

# prompt is the ChatPromptTemplate defined further below
master_agent_gpro = create_tool_calling_agent(model_pro, [], prompt)
master_agent_claude = create_tool_calling_agent(model_claude, [], prompt)

master_agent_executor_gpro = AgentExecutor(agent=master_agent_gpro, tools=[], verbose=True)
master_agent_executor_gpro.invoke({"input": "how are you"})

output:

{'input': 'how are you',
 'output': "I'm doing well, thank you for asking! How are you today?\n"}

And the same with the Claude agent:

master_agent_executor_claude = AgentExecutor(agent=master_agent_claude, tools=[], verbose=True)
master_agent_executor_claude.invoke({"input": "how are you"})

output:

{'input': 'how are you',
 'output': [{'text': "As an AI language model, I don't have feelings or personal experiences, but I'm functioning well and ready to assist you with any questions or information you need. How can I help you today?",
   'type': 'text',
   'index': 0}]}
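A minimal way to see the divergence (a sketch reusing the executors defined above):

gemini_out = master_agent_executor_gpro.invoke({"input": "how are you"})["output"]
claude_out = master_agent_executor_claude.invoke({"input": "how are you"})["output"]

print(type(gemini_out))   # <class 'str'>
print(type(claude_out))   # <class 'list'> -- a list of content-block dicts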

Given this, it is unsurprising that a conversation with memory works well when using the model alone, but not when using the AgentExecutor instance of the model:

from langchain.memory import ConversationBufferWindowMemory, ConversationSummaryBufferMemory
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables.history import RunnableWithMessageHistory

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)

chain = prompt | model_claude

# only used by the commented-out alternative history below
demo_summary_buffer_history = ConversationSummaryBufferMemory(llm=model_claude, max_token_limit=40, return_messages=True)

store = {}  # memory is maintained outside the chain

def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
    # new session: start with an empty history
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
        return store[session_id]

    # existing session: keep only the last k=4 exchanges
    memory = ConversationBufferWindowMemory(
        chat_memory=store[session_id],
        k=4,
        return_messages=True,
    )
    assert len(memory.memory_variables) == 1
    key = memory.memory_variables[0]
    messages = memory.load_memory_variables({})[key]
    store[session_id] = InMemoryChatMessageHistory(messages=messages)
    return store[session_id]

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    # lambda session_id: demo_summary_buffer_history.chat_memory,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

Calling chain_with_message_history:

chain_with_message_history.invoke({"input": "hi there"}, {"configurable": {"session_id": "unused"}})

output:

AIMessage(content="Hello! How can I assist you today? Feel free to ask any questions or let me know if there's anything you'd like help with.", response_metadata={'id': 'msg_vrtx_0151pxvP5RHDUo24gqWHo3XK', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 25, 'output_tokens': 32}}, id='run-731fcb10-e650-4ac8-94b7-c4fd5c3b9966-0', usage_metadata={'input_tokens': 25, 'output_tokens': 32, 'total_tokens': 57})

and again a second time:

chain_with_message_history.invoke({"input": "how are you"}, {"configurable": {"session_id": "unused"}})

output:

AIMessage(content="As an AI language model, I don't have feelings, but I'm functioning well and ready to assist you with any questions or tasks you might have. How can I help you today?", response_metadata={'id': 'msg_vrtx_0134Lagkyxna8RU4YLSLfrno', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 63, 'output_tokens': 41}}, id='run-866b5e11-d16d-40a0-89ed-99eb97ba98b7-0', usage_metadata={'input_tokens': 63, 'output_tokens': 41, 'total_tokens': 104})

We encounter no issues and even the store content is ok:

store["unused"]
InMemoryChatMessageHistory(messages=[HumanMessage(content='hi there'), AIMessage(content="Hello! How can I assist you today? Feel free to ask any questions or let me know if there's anything you'd like help with.", response_metadata={'id': 'msg_vrtx_0151pxvP5RHDUo24gqWHo3XK', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 25, 'output_tokens': 32}}, id='run-731fcb10-e650-4ac8-94b7-c4fd5c3b9966-0', usage_metadata={'input_tokens': 25, 'output_tokens': 32, 'total_tokens': 57}), HumanMessage(content='how are you'), AIMessage(content="As an AI language model, I don't have feelings, but I'm functioning well and ready to assist you with any questions or tasks you might have. How can I help you today?", response_metadata={'id': 'msg_vrtx_0134Lagkyxna8RU4YLSLfrno', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 63, 'output_tokens': 41}}, id='run-866b5e11-d16d-40a0-89ed-99eb97ba98b7-0', usage_metadata={'input_tokens': 63, 'output_tokens': 41, 'total_tokens': 104})])
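Note (my own check, a sketch): every message stored here has plain-string content, which is presumably why the history round-trips cleanly:

assert all(isinstance(m.content, str) for m in store["unused"].messages)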

However, if we do the same with an AgentExecutor instance:

store = {}  # reset the memory store

# get_session_history is defined exactly as above

chain_with_message_history = RunnableWithMessageHistory(
    master_agent_executor_claude,
    # lambda session_id: demo_summary_buffer_history.chat_memory,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
chain_with_message_history.invoke({"input": "hi there"}, {"configurable": {"session_id": "unused"}})

output:

{'input': 'hi there',
 'chat_history': [],
 'output': [{'text': "Hello! How can I assist you today? I'm here to help with any questions you might have or tasks you need help with. Feel free to ask about any topic, and I'll do my best to provide you with helpful information or guidance.",
   'type': 'text',
   'index': 0}]}

but calling it a second time (which uses the stored memory):

chain_with_message_history.invoke({"input": "how are you"}, {"configurable": {"session_id": "unused"}})

output:

...
ValidationError: 1 validation error for InMemoryChatMessageHistory
messages -> 1
  BaseMessage.__init__() missing 1 required positional argument: 'content' (type=type_error)
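My working theory (an assumption on my part, not verified against the library source): RunnableWithMessageHistory stores the executor's output value as the content of an AIMessage, and when that content is a list of block dicts rather than a plain string, the stored history fails validation on the next read. A hedged workaround is to flatten the output to text before it reaches the history; flatten_claude_output below is a hypothetical helper of mine, not part of any library:

from langchain_core.runnables import RunnableLambda

def flatten_claude_output(result: dict) -> dict:
    # Collapse Anthropic-style content blocks into a single string so that
    # the AIMessage stored in the history has plain-string content.
    out = result.get("output")
    if isinstance(out, list):
        result["output"] = "".join(
            block.get("text", "") for block in out if block.get("type") == "text"
        )
    return result

flattened_executor = master_agent_executor_claude | RunnableLambda(flatten_claude_output)

Wrapping flattened_executor in RunnableWithMessageHistory instead of the raw executor should then store string content, though I have not verified this against every case.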

Maybe there is something else that should be done that I am not seeing, but I have no such issue using the other Vertex AI models (Gemini Pro or Flash).

Running on Python 3.10 with the following langchain dependencies:

langchain                     0.2.14
langchain-community           0.2.12
langchain-core                0.2.35
langchain-google-vertexai     1.0.10