gokuljs opened this issue 1 month ago
Hey there, @gokuljs! I'm here to help you with any issues you're experiencing while using LangChain with ChatOpenAI. Let's work together to solve this bug and improve your experience!
To prevent function call information from being printed during real-time streaming when using LangChain with the ChatOpenAI model, you can avoid console.log statements within the callback functions and configure the model not to include token usage in the stream. Here is an example of how you can set up the ChatOpenAI model for streaming without printing function call information:
import { ChatOpenAI } from "@langchain/openai";
import { SystemMessage, HumanMessage } from "@langchain/core/messages";
const model = new ChatOpenAI({
  modelName: "gpt-4o",
  streaming: true,
  streamUsage: false, // Disable streaming of token usage
  callbacks: [
    {
      async handleLLMNewToken(token: string) {
        // Handle the new token without printing it
      },
    },
  ],
});
const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];
const modelResponse = await model.stream(messages, {
  signal: modelController.signal,
});
let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);
async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
In this setup, the handleLLMNewToken callback function is defined but does not include any console.log statements, preventing function call information from being printed during real-time streaming. Additionally, the streamUsage option is set to false to ensure that token usage information is not included in the stream [1].
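Note that when the model is bound to tools, streamed AIMessageChunks typically carry tool/function call data in fields such as additional_kwargs and tool_call_chunks rather than a top-level function_call property. The following is only a sketch of a chunk-level filter; the field names assume @langchain/core 0.2.x and should be verified against your installed version:
import { AIMessageChunk } from "@langchain/core/messages";

// Sketch: returns true when a streamed chunk carries tool/function call data
// rather than plain text content (field names assumed from @langchain/core 0.2.x).
function isToolCallChunk(chunk: AIMessageChunk): boolean {
  const hasToolCallChunks =
    Array.isArray(chunk.tool_call_chunks) && chunk.tool_call_chunks.length > 0;
  const hasFunctionCall = Boolean(chunk.additional_kwargs?.function_call);
  return hasToolCallChunks || hasFunctionCall;
}

// Usage inside the streaming loop:
// for await (const chunk of stream) {
//   if (isToolCallChunk(chunk)) continue; // skip tool/function call chunks
//   if (chunk.content) llmResponse += chunk.content;
// }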
To continue talking to Dosu, mention @dosu.
@dosu give an example
Sure! Here's an example of how you can set up the ChatOpenAI model for streaming without printing function call information:
import { ChatOpenAI } from "@langchain/openai";
import { SystemMessage, HumanMessage } from "@langchain/core/messages";
const llm = new ChatOpenAI({
  modelName: "gpt-4o",
  temperature: 0,
  streaming: true,
  streamUsage: false, // Disable streaming of token usage
});
const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];
const modelResponse = await llm.stream(messages, {
  signal: modelController.signal,
});
async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);
In this example, the streamUsage option is set to false to ensure that token usage information is not included in the stream. The processLlmStream function processes the stream and ignores any chunks that contain function call information.
To continue talking to Dosu, mention @dosu.
again same issue @dosu
To prevent function call information from being printed during real-time streaming when using LangChain with ChatOpenAI, you can modify the callback handling to exclude or filter out the function call information. Here is an example of how you can achieve this:
import { ChatOpenAI } from "@langchain/openai";
import { SystemMessage, HumanMessage } from "@langchain/core/messages";
const model = new ChatOpenAI({
  modelName: "gpt-4o",
  streaming: true,
  callbacks: [
    {
      async handleLLMNewToken(token: string) {
        // Filter out function call information
        if (!token.includes("getPerformanceReviews") && !token.includes("getUserFeedback")) {
          console.log(token); // Process the token as needed
        }
      },
    },
  ],
});
const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];
const modelResponse = await model.stream(messages, {
  signal: modelController.signal,
});
async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);
In this example, the handleLLMNewToken callback filters out tokens that include the function call names getPerformanceReviews and getUserFeedback. You can adjust the filtering logic as needed to suit your specific requirements [1].
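For instance, here is a small sketch of a reusable filtering callback; the toolNames list and onToken handler are hypothetical placeholders supplied by your own code:
// Hypothetical helper: builds a handleLLMNewToken callback that drops any token
// mentioning one of the supplied tool names and forwards the rest to onToken.
function makeTokenFilterCallback(
  toolNames: string[],
  onToken: (token: string) => void
) {
  return {
    async handleLLMNewToken(token: string) {
      if (toolNames.some((name) => token.includes(name))) {
        return; // skip tokens that look like tool/function call text
      }
      onToken(token);
    },
  };
}

// Usage with the tool names from the example above:
// callbacks: [makeTokenFilterCallback(["getPerformanceReviews", "getUserFeedback"], (t) => process.stdout.write(t))]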
To continue talking to Dosu, mention @dosu.
Could this behaviour be happening because streamUsage is set to true? @dosu
Yes, the behavior of function call information being printed during real-time streaming can be caused by the streamUsage option being set to true. When streamUsage is true, usage metadata (such as input tokens, output tokens, and total tokens) is included in the response, which can result in function call information being printed [1][2].
To prevent this, you can set streamUsage to false:
const llm = new ChatOpenAI({
  modelName: "gpt-4o",
  temperature: 0,
  streaming: true,
  streamUsage: false, // Disable streaming of token usage
});
Additionally, ensure that your callback handling logic filters out function call information as needed.
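If you want to confirm what streamUsage actually adds to the stream, here is a minimal sketch (reusing the llm and messages defined above, and assuming @langchain/core 0.2.x, where usage data is surfaced as chunk.usage_metadata):
// Minimal sketch: log whether each streamed chunk carries usage metadata or text,
// to see what enabling streamUsage contributes to the stream.
const debugStream = await llm.stream(messages);
for await (const chunk of debugStream) {
  if (chunk.usage_metadata) {
    // e.g. { input_tokens, output_tokens, total_tokens }, typically on the final chunk
    console.log("usage chunk:", chunk.usage_metadata);
  } else if (chunk.content) {
    console.log("content chunk:", chunk.content);
  }
}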
To continue talking to Dosu, mention @dosu.
Checked other resources
Example Code
const llm = new ChatOpenAI({
  model: "gpt-4o",
  temperature: 0,
  streaming: true,
  streamUsage: true,
});

const llmWithTools = llm.bindTools(tools);

const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];

const modelResponse = await llmWithTools.stream(messages, {
  signal: modelController.signal,
});

let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);

async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
Error Message and Stack Trace (if applicable)
No response
Description
I am using LangChain with the ChatOpenAI model and streaming the output in real-time. However, occasionally the output includes function call information.
I would like to prevent function call information from being printed during real-time streaming. Is there a way to handle this? Any suggestions or advice on how to fix this issue would be appreciated.
Steps to Reproduce:
1. Use LangChain with ChatOpenAI.
2. Stream the output in real-time.
3. Observe that sometimes function call details are printed in the stream.
Expected Behavior:
The output should only show the relevant text without printing the function call information.
Current Behavior:
Function call information is sometimes included in the real-time streamed output.
System Info
"@langchain/community": "^0.2.22", "@langchain/core": "^0.2.20", "@langchain/langgraph": "^0.0.31", "@langchain/openai": "^0.2.5", "langchain": "^0.2.12"