langchain-ai / langchain-aws

Build LangChain Applications on AWS
MIT License
104 stars 81 forks source link

Redundant Tool Usage output being returned together with text outputs in the content field of AIMessage in ChatBedrockConverse. #97

Closed thiagotps closed 3 months ago

thiagotps commented 4 months ago

When running the following code

from langchain_core.prompts import ChatPromptTemplate
from langchain_aws import ChatBedrockConverse, ChatBedrock
from langchain_core.runnables import RunnableAssign,RunnablePassthrough
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.output_parsers import StrOutputParser
import uuid
from langchain_core.tools import tool
from typing import Annotated, Literal

@tool
def math_tool(op: Annotated[Literal["sum" , "sub" , "mul" , "div"], "The math operation to be perfomerd"], x: float, y: float) -> float:
    """Perform mathematical operations."""
    match op:
        case "sum":
            return x + y
        case "sub":
            return x - y
        case "mul":
            return x * y
        case "div":
            return x/y
        case _:
            raise ToolException(f"Operation {op} not recognized.")

model = ChatBedrockConverse(model_id="anthropic.claude-3-haiku-20240307-v1:0", client=client)

prompt = ChatPromptTemplate.from_messages([
("system", """Answer the user's request using relevant tools (if they are available). Before calling a tool, do some analysis within <thinking></thinking> tags. First, think about which of the provided tools is the relevant tool to answer the user's request. Second, go through each of the required parameters of the relevant tool and determine if the user has directly provided or given enough information to infer a value. When deciding if the parameter can be inferred, carefully consider all the context to see if it supports a specific value. If all of the required parameters are present or can be reasonably inferred, close the thinking tag and proceed with the tool call. BUT, if one of the values for a required parameter is missing, DO NOT invoke the function (not even with fillers for the missing params) and instead, ask the user to provide the missing parameters. DO NOT ask for more information on optional parameters if it is not provided."""),
("placeholder", "{messages}")])

chain = {"messages": RunnablePassthrough()} | prompt | model.bind_tools([math_tool])
msg = chain.invoke([HumanMessage(content="How much is 3*3 ?")])
msg.content

The content field of the AIMessage returned is:

[{'type': 'text',
  'text': '<thinking>\nThe relevant tool for this request is the "math_tool" which can perform various mathematical operations. The required parameters for this tool are "op" (the operation to perform), "x" (the first operand), and "y" (the second operand).\n\nThe user has requested to find the result of 3 * 3, which corresponds to the "mul" (multiplication) operation with "x" = 3 and "y" = 3. All the required parameters are provided, so I can proceed with invoking the math_tool.\n</thinking>'},
  {'type': 'tool_use',
   'name': 'math_tool',
   'input': {'op': 'mul', 'x': 3, 'y': 3},
   'id': 'tooluse_DOvtVdkMSECOFKJ5usGsgA'}]

A list containing both the text block and the tool call.

But the tool_call field of the AIMessage class is already populate with that tool information:

msg.tool_calls
 [{'name': 'math_tool',
   'args': {'op': 'mul', 'x': 3, 'y': 3},
   'id': 'tooluse_DOvtVdkMSECOFKJ5usGsgA'}]

Is that the expected behavior ? Isn't it redundant to return tool calling information both in the content and tool_calls fields ?

3coins commented 4 months ago

@thiagotps The text response in your example seems to be a symptom of the prompt you are providing, where you have instructed the model to call the tool; the model seems to infer the function here and actually doing the calculations. For a usual tool call, this should be done by the client, and results passed back to the model. I don't see how this is an issue with the ChatBedrockConverse class. Have you tried running the same prompt with tool calling directly using Bedrock API and compared the results?

3coins commented 4 months ago

I modified your code slightly to remove the system instructions which only returned the tool_call response.

prompt = ChatPromptTemplate.from_messages([("placeholder", "{messages}")])

Output

[{'type': 'tool_use',
  'name': 'math_tool',
  'input': {'op': 'mul', 'x': 3, 'y': 3},
  'id': 'tooluse_QYT9YJl1TO6ewv09Xzh_ZA'}]
thiagotps commented 4 months ago

@3coins In your example, the tool_call response is still being passed in the content field. What I imagine would be the correct behavior in this situation, is the content to be a empty string, since the model didn't return any text.

As an example, using the ChatBedrock class:

from langchain_core.prompts import ChatPromptTemplate
from langchain_aws import  ChatBedrock
from langchain_core.runnables import RunnableAssign,RunnablePassthrough
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.output_parsers import StrOutputParser
import uuid
from langchain_core.tools import tool
from typing import Annotated, Literal

@tool
def math_tool(op: Annotated[Literal["sum" , "sub" , "mul" , "div"], "The math operation to be perfomerd"], x: float, y: float) -> float:
    """Perform mathematical operations."""
    match op:
        case "sum":
            return x + y
        case "sub":
            return x - y
        case "mul":
            return x * y
        case "div":
            return x/y
        case _:
            raise ToolException(f"Operation {op} not recognized.")

model = ChatBedrock(model_id="anthropic.claude-3-haiku-20240307-v1:0", client=client)

prompt = ChatPromptTemplate.from_messages([("placeholder", "{messages}")])

chain = {"messages": RunnablePassthrough()} | prompt | model.bind_tools([math_tool])
msg = chain.invoke([HumanMessage(content="How much is 3*3 ?")])
msg

The output is:

AIMessage(content='', additional_kwargs={'usage': {'prompt_tokens': 373, 'completion_tokens': 106, 'total_tokens': 479}, 'stop_reason': 'tool_use', 'model_id': 'anthropic.claude-3-haiku-20240307-v1:0'}, response_metadata={'usage': {'prompt_tokens': 373, 'completion_tokens': 106, 'total_tokens': 479}, 'stop_reason': 'tool_use', 'model_id': 'anthropic.claude-3-haiku-20240307-v1:0'}, id='run-6563b53d-519f-4473-9e66-5f8099114ef8-0', tool_calls=[{'name': 'math_tool', 'args': {'op': 'mul', 'x': 3, 'y': 3}, 'id': 'toolu_bdrk_01RX1M7qCoRzQ4J7ZS2YBEYE'}], usage_metadata={'input_tokens': 373, 'output_tokens': 106, 'total_tokens': 479})

The too_call information is only passed in the tool_calls list of the AIMessage, its content variable is only a empty string.

thiagotps commented 4 months ago

@3coins I think the issue is in line 571 of bedrock_converse.py.

def _parse_response(response: Dict[str, Any]) -> AIMessage:
    anthropic_content = _bedrock_to_anthropic(
        response.pop("output")["message"]["content"]
    )
    tool_calls = _extract_tool_calls(anthropic_content)
    usage = UsageMetadata(_camel_to_snake_keys(response.pop("usage")))  # type: ignore[misc]
    return AIMessage(
        content=_str_if_single_text_block(anthropic_content),  # type: ignore[arg-type]
        usage_metadata=usage,
        response_metadata=response,
        tool_calls=tool_calls,
    )

All of anthropic_content is being passed to the content of AIMessage, despite the tool_calls being extracted above and being passed separately in the tool_calls field of AIMessage.

thiagotps commented 4 months ago

I created a pull request #100 for this.

3coins commented 4 months ago

@thiagotps Looking at the Anthropic documentation, it seems like content block along with the tool_use info is normal in case a user want to look at the chain of thought reasoning that the model used to decide which tool to use. https://docs.anthropic.com/en/docs/build-with-claude/tool-use#chain-of-thought

I also looked at the ChatAnthropic and it seems to also add the content as-is. https://github.com/langchain-ai/langchain/blob/master/libs/partners/anthropic/langchain_anthropic/chat_models.py#L716-L741

Let me know if I am missing anything.