langchain-ai / langchainjs


anthropic: `input_json_delta` tokens should not be passed to `handleLLMNewToken` #6936

Open · danielleontiev opened this issue 2 months ago

Example Code

Introduced in https://github.com/langchain-ai/langchainjs/pull/6179#issuecomment-2245995402: when tools are bound, the Anthropic chat model passes `input_json_delta` tokens to `handleLLMNewToken`, propagating the unwanted tokens to the callbacks.

```typescript
import { ChatAnthropic } from "@langchain/anthropic";
import { HumanMessage } from "@langchain/core/messages";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const weatherTool = tool(
  ({ location }) => "The weather in " + location + " is sunny",
  {
    name: "get_current_weather",
    description: "Get the current weather for a location",
    schema: z.object({
      location: z.string(),
    }),
  }
);

const newTokenCallback = {
  handleLLMNewToken(token: string) {
    console.log("New token: " + token);
  },
};

const chat = new ChatAnthropic({ model: "claude-3-5-sonnet-20240620", streaming: true });
const chatWithTools = chat.bindTools([weatherTool]);

const messages = [new HumanMessage("What's the weather in Paris?")];

console.log(await chatWithTools.invoke(messages, { callbacks: [newTokenCallback] }));
```
Output
```
New token: Certainly
New token: !
New token: I
New token: can help
New token: you fin
New token: d out
New token: the current weather in
New token: Paris.
New token: To
New token: get this
New token: information, I'll
New token: use
New token: the
New token: get
New token: _current_weather function
New token: .
New token: Let
New token: me fetch
New token: that
New token: data
New token: for
New token: you.
New token: {"location":
New token: "P
New token: aris"}
AIMessageChunk {
  "id": "msg_01DdYShYboNJmRuUboYtFkJo",
  "content": [
    {
      "index": 0,
      "type": "text",
      "text": "Certainly! I can help you find out the current weather in Paris. To get this information, I'll use the get_current_weather function. Let me fetch that data for you."
    },
    {
      "index": 1,
      "type": "tool_use",
      "id": "toolu_013m574E44V7kBZtyEzWFgFv",
      "name": "get_current_weather",
      "input": "{\"location\": \"Paris\"}"
    }
  ],
  "additional_kwargs": {
    "id": "msg_01DdYShYboNJmRuUboYtFkJo",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-5-sonnet-20240620",
    "stop_reason": "tool_use",
    "stop_sequence": null
  },
  "response_metadata": {},
  "tool_calls": [
    {
      "name": "get_current_weather",
      "args": { "location": "Paris" },
      "id": "toolu_013m574E44V7kBZtyEzWFgFv",
      "type": "tool_call"
    }
  ],
  "tool_call_chunks": [
    {
      "args": "{\"location\": \"Paris\"}",
      "id": "toolu_013m574E44V7kBZtyEzWFgFv",
      "name": "get_current_weather",
      "index": 1,
      "type": "tool_call_chunk"
    }
  ],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 392,
    "output_tokens": 95,
    "total_tokens": 487
  }
}
```

The last three tokens are wrong because they are part of Anthropic's `input_json_delta`:

```
New token: {"location":
New token:  "P
New token: aris"}
```
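
Until this is fixed, the tokens can be filtered on the consumer side. Below is a minimal workaround sketch; it assumes a @langchain/core version that passes the underlying chunk to the callback as the sixth (`fields`) argument, and that the Anthropic chunks carrying partial tool-call JSON populate `tool_call_chunks`. Both assumptions are worth verifying before relying on this.

```typescript
import { AIMessageChunk } from "@langchain/core/messages";

// Workaround sketch: skip tokens whose chunk carries a tool-call delta.
// The shape of `fields.chunk` here is an assumption, not a documented API.
const filteredTokenCallback = {
  handleLLMNewToken(
    token: string,
    _idx?: unknown,
    _runId?: string,
    _parentRunId?: string,
    _tags?: string[],
    fields?: { chunk?: { message?: unknown } }
  ) {
    const message = fields?.chunk?.message;
    if (message instanceof AIMessageChunk && message.tool_call_chunks?.length) {
      // Partial tool-call JSON (`input_json_delta`), not assistant text.
      return;
    }
    console.log("New token: " + token);
  },
};
```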

For comparison, OpenAI behaves differently:

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Same tool, callback, and messages as in the Anthropic example above.
const chat = new ChatOpenAI({ model: "gpt-4o", streaming: true });
const chatWithTools = chat.bindTools([weatherTool]);

console.log(await chatWithTools.invoke(messages, { callbacks: [newTokenCallback] }));
```
Output
```
New token:
New token:
New token:
New token:
New token:
New token:
New token:
AIMessageChunk {
  "id": "chatcmpl-AEQLZY0XohZRecOzovcUdPxDQUNZg",
  "content": "",
  "additional_kwargs": {
    "tool_calls": [
      {
        "index": 0,
        "id": "call_vHoeJADcXYCVHtbo0XkAU2cB",
        "type": "function",
        "function": "[Object]"
      }
    ]
  },
  "response_metadata": {
    "estimatedTokenUsage": {
      "promptTokens": 13,
      "completionTokens": 0,
      "totalTokens": 13
    },
    "prompt": 0,
    "completion": 0,
    "finish_reason": "tool_calls",
    "system_fingerprint": "fp_e5e4913e83",
    "usage": {
      "prompt_tokens": 52,
      "completion_tokens": 15,
      "total_tokens": 67,
      "prompt_tokens_details": { "cached_tokens": 0 },
      "completion_tokens_details": { "reasoning_tokens": 0 }
    }
  },
  "tool_calls": [
    {
      "name": "get_current_weather",
      "args": { "location": "Paris" },
      "id": "call_vHoeJADcXYCVHtbo0XkAU2cB",
      "type": "tool_call"
    }
  ],
  "tool_call_chunks": [
    {
      "name": "get_current_weather",
      "args": "{\"location\":\"Paris\"}",
      "id": "call_vHoeJADcXYCVHtbo0XkAU2cB",
      "index": 0,
      "type": "tool_call_chunk"
    }
  ],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 52,
    "output_tokens": 15,
    "total_tokens": 67
  }
}
```

Interestingly, the behavior is not reproduced when using `.streamEvents` directly on the chat model, because the argument tokens are returned as part of `input_json_delta` objects. It is reproduced, however, when using `.streamEvents` on a LangGraph agent, because the content is returned as a plain string in that case.

```typescript
const chat = new ChatAnthropic({ model: "claude-3-5-sonnet-20240620", streaming: true });
const chatWithTools = chat.bindTools([weatherTool]);

const messages = [new HumanMessage("What's the weather in Paris?")];

for await (const response of chatWithTools.streamEvents(messages, { version: "v2" })) {
  if (response.event === "on_chat_model_stream") {
    console.log(response.data.chunk?.content);
  }
}
```
Output (truncated)
```
[ { index: 0, type: 'text_delta', text: ' to' } ]
[ { index: 0, type: 'text_delta', text: ' fetch' } ]
[ { index: 0, type: 'text_delta', text: ' the' } ]
[ { index: 0, type: 'text_delta', text: ' current' } ]
[ { index: 0, type: 'text_delta', text: ' weather data.' } ]
[ { index: 0, type: 'text_delta', text: ' Let' } ]
[ { index: 0, type: 'text_delta', text: ' me ' } ]
[ { index: 0, type: 'text_delta', text: 'do that for' } ]
[ { index: 0, type: 'text_delta', text: ' you right' } ]
[ { index: 0, type: 'text_delta', text: ' away' } ]
[ { index: 0, type: 'text_delta', text: '.' } ]
[ { index: 1, type: 'tool_use', id: 'toolu_01Ej6eX3dAskhZa3Nq2B6Ro4', name: 'get_current_weather', input: '' } ]
[ { index: 1, input: '', type: 'input_json_delta' } ]
[ { index: 1, input: '{"lo', type: 'input_json_delta' } ]
[ { index: 1, input: 'cation": "', type: 'input_json_delta' } ]
[ { index: 1, input: 'Paris"}', type: 'input_json_delta' } ]
[]
```
```typescript
import { createReactAgent } from "@langchain/langgraph/prebuilt";

const chat = new ChatAnthropic({ model: "claude-3-5-sonnet-20240620", streaming: true });

const messages = [new HumanMessage("What's the weather in Paris?")];

const agent = createReactAgent({
  llm: chat,
  tools: [weatherTool],
});

for await (const response of agent.streamEvents({ messages }, { version: "v2" })) {
  if (response.event === "on_chat_model_stream") {
    console.log(response.data.chunk?.content);
  }
}
```
Output (truncated)
```
Let
me do that for
you right away
.
{"loc
ation": "Paris"}
```
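
Pending a proper fix, the agent's event stream can also be filtered on the consumer side. This is only a sketch: it assumes these chunks are `AIMessageChunk` instances that populate `tool_call_chunks` alongside the string content, which I have not verified.

```typescript
import { AIMessageChunk } from "@langchain/core/messages";

// Sketch: drop chunks that only carry partial tool-call JSON.
for await (const response of agent.streamEvents({ messages }, { version: "v2" })) {
  if (response.event === "on_chat_model_stream") {
    const chunk = response.data.chunk;
    if (chunk instanceof AIMessageChunk && chunk.tool_call_chunks?.length) continue;
    console.log(chunk?.content);
  }
}
```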

I am not sure what the correct streaming behavior should be, but the callback behavior, at least, could be fixed by excluding the input tokens here:

https://github.com/langchain-ai/langchainjs/blob/660af3eedb64bf4af98eb5d0547782b84b2ed52c/libs/langchain-anthropic/src/chat_models.ts#L167-L174
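
For illustration, here is a minimal sketch of what the change could look like, assuming the token-extraction helper at that permalink resembles the following (paraphrased, not copied verbatim):

```typescript
import type { AIMessageChunk } from "@langchain/core/messages";

// Paraphrased sketch of the token-extraction helper. Returning only text
// deltas, and no longer returning the `input` of partial tool-use blocks,
// would keep `input_json_delta` fragments out of handleLLMNewToken.
function extractToken(chunk: AIMessageChunk): string | undefined {
  if (typeof chunk.content === "string") {
    return chunk.content;
  }
  if (Array.isArray(chunk.content) && chunk.content.length >= 1) {
    const block = chunk.content[0];
    if ("text" in block && typeof block.text === "string") {
      return block.text;
    }
    // A branch that returned `block.input` for partial tool inputs would be
    // removed here, so input tokens are never surfaced as text.
  }
  return undefined;
}
```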

Error Message and Stack Trace (if applicable)

No response

Description

"Argumnets" tokens should not appear in the callbacks and when streaming with LangGraph

System Info

npm info langchain
```
langchain@0.3.2 | MIT | deps: 12 | versions: 297
Typescript bindings for langchain
https://github.com/langchain-ai/langchainjs/tree/main/langchain/
keywords: llm, ai, gpt3, chain, prompt, prompt engineering, chatgpt, machine learning, ml, openai, embeddings, vectorstores

dist
.tarball: https://registry.npmjs.org/langchain/-/langchain-0.3.2.tgz
.shasum: aec3e679d3d6c36f469448380affa475c92fbd86
.integrity: sha512-kd2kz1cS/PIVrLEDFlrZsAasQfPLbY1UqCZbRKa3/QcpB33/n6xPDvXSMfBuKhvNj0bjW6MXDR9HZTduXjJBgg==
.unpackedSize: 2.9 MB

dependencies:
@langchain/openai: >=0.1.0 <0.4.0
jsonpointer: ^5.0.1
uuid: ^10.0.0
@langchain/textsplitters: >=0.0.0 <0.2.0
langsmith: ^0.1.56-rc.1
yaml: ^2.2.1
js-tiktoken: ^1.0.12
openapi-types: ^12.1.3
zod-to-json-schema: ^3.22.3
js-yaml: ^4.1.0
p-retry: 4
zod: ^3.22.4

maintainers:
- nfcampos
- jacoblee93
- andrewnguonly
- davidduong
- hwchase17
- basproul

dist-tags:
latest: 0.3.2
next: 0.3.2-rc.0
tag-for-publishing-older-releases: 0.2.20

published 2 weeks ago by jacoblee93
```
dosubot[bot] commented 2 months ago

Hey, @danielleontiev! I'm here to assist you with any bugs, questions, or contributions you have. Let's work together to solve this issue!

To resolve the issue of `input_json_delta` tokens being incorrectly passed to the `handleLLMNewToken` callback in the LangChain.js library when using the Anthropic chat model, you can modify the `_formatContent` function. Specifically, ensure that `input_json_delta` types are converted to `tool_use` types, as `input_json_delta` is not a valid type for Anthropic messages. This conversion is already present in the code:

```typescript
if (contentPartCopy.type === "input_json_delta") {
  // `input_json_delta` type only represents yielding partial tool inputs
  // and is not a valid type for Anthropic messages.
  contentPartCopy.type = "tool_use";
}
```

This code snippet ensures that any `input_json_delta` type is converted to `tool_use`, which is a valid type for Anthropic messages, thus resolving the issue [1].

To continue talking to Dosu, mention @dosu.

jacoblee93 commented 2 months ago

CC @bracesproul I think this is unintended