langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

Unable to get streaming response from BedrockChat in a custom agent #5917

Closed: MrHertal closed this issue 2 months ago

MrHertal commented 3 months ago


Example Code

  const llm = new BedrockChat({
    model: "anthropic.claude-3-sonnet-20240229-v1:0",
    modelKwargs: {
      temperature: 0,
    },
    streaming: true,
    callbacks: [
      {
        async handleLLMNewToken(token) {
          console.log({ token });
        },
      },
    ],
  });

  const memory = new BufferMemory({
    chatHistory: new DynamoDBChatMessageHistory({
      tableName: memoryTable,
      partitionKey: "SessionId",
      sessionId,
    }),
    returnMessages: true,
    memoryKey: "chat_history",
    outputKey: "output",
  });

  const agent = createToolCallingAgent({
    llm,
    tools,
    prompt,
  });

  const agentExecutor = new AgentExecutor({
    agent,
    tools,
    memory,
  });

  const result = await agentExecutor.invoke({
    input: message,
  });

Error Message and Stack Trace (if applicable)

No response

Description

Hi,

I'm following this example: https://github.com/langchain-ai/langchainjs/blob/main/examples/src/agents/streaming.ts to enable streaming in an agent response.

I'm using BedrockChat as the model with the following options:

  const llm = new BedrockChat({
    model: "anthropic.claude-3-sonnet-20240229-v1:0",
    modelKwargs: {
      temperature: 0,
    },
    streaming: true,
    callbacks: [
      {
        async handleLLMNewToken(token) {
          console.log({ token });
        },
      },
    ],
  });

I'm expecting to see the agent's response streamed to the console token by token, but I always get a single token containing the full response:

{
  token: "the full response..."
}

Am I missing something?

System Info

{
  name: 'langchain',
  description: 'Typescript bindings for langchain',
  'dist-tags': {
    next: '0.2.3-rc.0',
    latest: '0.2.7'
  },
  version: '0.2.7',
  ...
}
yarn -v
1.22.22
node -v
v20.11.0
dosubot[bot] commented 3 months ago

Hey there, @MrHertal! I'm here to help you with the issue you've encountered. I'm a bot designed to assist with bug-solving, answering questions, and guiding potential contributors. Let's work together to resolve this issue!

You are unable to get a streaming response from BedrockChat in a custom agent and instead receive a single token with the full response because the streaming option is not enabled in the BedrockChat configuration for the custom agent.

To enable streaming, you need to set the streaming property to true and provide the appropriate callbacks for handling new tokens and the end of the response. Here is an example of how to configure it:

import { BedrockChat } from "@langchain/community/chat_models/bedrock";
import { HumanMessage } from "@langchain/core/messages";

async function runStreaming() {
  // Instantiate the BedrockChat model with the desired configuration
  const model = new BedrockChat({
    model: "anthropic.claude-3-sonnet-20240229-v1:0",
    region: "us-east-1",
    credentials: {
      accessKeyId: process.env.BEDROCK_AWS_ACCESS_KEY_ID!,
      secretAccessKey: process.env.BEDROCK_AWS_SECRET_ACCESS_KEY!,
    },
    maxTokens: 150,
    temperature: 0.7,
    stopSequences: ["\n", " Human:", " Assistant:"],
    streaming: true, // Enable streaming
    trace: "ENABLED",
    guardrailIdentifier: "your-guardrail-id",
    guardrailVersion: "1.0",
    guardrailConfig: {
      tagSuffix: "example",
      streamProcessingMode: "SYNCHRONOUS",
    },
  });

  // Prepare the message to be sent to the model
  const message = new HumanMessage({ content: "Tell me a joke" });

  // Stream the response from the model
  const stream = await model.stream([message]);
  for await (const chunk of stream) {
    // Output each chunk of the response
    console.log(chunk.content);
  }
}

runStreaming().catch(console.error);

This configuration ensures that the BedrockChat instance is set up to handle streaming responses, and the stream method will yield each chunk of the response as it is received [1][2][3].

To continue talking to Dosu, mention @dosu.

MrHertal commented 3 months ago

Sorry @dosu but this is not helping.

I'm using BedrockChat in the context of a custom agent, so I don't run the model directly; the agent does that when I call executor.invoke()

dosubot[bot] commented 3 months ago

To enable proper token-by-token streaming responses from BedrockChat in your custom agent setup, you need to ensure that the streamRunnable parameter is set to true in the createToolCallingAgent function. Here is an example of how to configure it:

import { BedrockChat } from "@langchain/community/chat_models/bedrock";
import { DynamoDBChatMessageHistory } from "@langchain/community/stores/message/dynamodb";
import { BufferMemory } from "langchain/memory";
import { createToolCallingAgent, AgentExecutor } from "langchain/agents";

const llm = new BedrockChat({
  model: "anthropic.claude-3-sonnet-20240229-v1:0",
  modelKwargs: {
    temperature: 0,
  },
  streaming: true,
  callbacks: [
    {
      async handleLLMNewToken(token) {
        console.log({ token });
      },
    },
  ],
});

const memory = new BufferMemory({
  chatHistory: new DynamoDBChatMessageHistory({
    tableName: memoryTable,
    partitionKey: "SessionId",
    sessionId,
  }),
  returnMessages: true,
  memoryKey: "chat_history",
  outputKey: "output",
});

const agent = createToolCallingAgent({
  llm,
  tools,
  prompt,
  streamRunnable: true, // Enable streaming
});

const agentExecutor = new AgentExecutor({
  agent,
  tools,
  memory,
});

const result = await agentExecutor.invoke({
  input: message,
});

By setting streamRunnable: true, you enable the agent to invoke the underlying model in streaming mode, allowing for token-by-token streaming responses [1][2].
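Alternatively, because AgentExecutor is itself a Runnable, its streamEvents API can surface the model's chunks while the agent runs. A minimal sketch (event names follow the v2 schema; not verified against BedrockChat specifically):

const eventStream = agentExecutor.streamEvents(
  { input: message },
  { version: "v2" }
);

for await (const event of eventStream) {
  if (event.event === "on_chat_model_stream") {
    // event.data.chunk is an AIMessageChunk emitted by the underlying model
    console.log(event.data.chunk.content);
  }
}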

To continue talking to Dosu, mention @dosu.

MrHertal commented 3 months ago

By default, streamRunnable is set to true. I also tried to explicitly set it to true, without success.

jeasonnow commented 2 months ago

By default, streamRunnable is set to true. I also tried to explicitly set it to true, without success.

Can you try the following code?

const result = await agentExecutor.stream({
  input: message,
});
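For completeness, the returned value is an async iterable, so it would be consumed roughly like this (as far as I understand, AgentExecutor.stream yields step-level chunks such as agent actions and the final output, not individual model tokens):

for await (const chunk of result) {
  // Each chunk is a step-level update (intermediate steps or the final
  // output), not a single token.
  console.log(chunk);
}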

@MrHertal

MrHertal commented 2 months ago

@jeasonnow thanks, but that did not work either.

I finally made it work by implementing this solution: https://github.com/aws-samples/langchain-agents/tree/main/bedrock/langchain-js-stream-agent
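Roughly, the pattern there (a sketch only; handler wiring and names are illustrative, see the repo for the working version) is to run the agent inside a Lambda function with response streaming enabled and forward each token from handleLLMNewToken to the response stream:

import { BedrockChat } from "@langchain/community/chat_models/bedrock";

// awslambda.streamifyResponse is a global provided by the Lambda Node.js
// runtime when response streaming is enabled for the function.
export const handler = awslambda.streamifyResponse(
  async (event, responseStream) => {
    const llm = new BedrockChat({
      model: "anthropic.claude-3-sonnet-20240229-v1:0",
      streaming: true,
      callbacks: [
        {
          // Forward each token to the client as soon as it arrives
          async handleLLMNewToken(token) {
            responseStream.write(token);
          },
        },
      ],
    });

    // ...build the agent and executor with this llm as shown above, then:
    // await agentExecutor.invoke({ input: ... });

    responseStream.end();
  }
);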