gokuljs opened this issue 1 month ago
Hey there, @gokuljs! I'm here to help you with any issues you're experiencing while using LangChain with ChatOpenAI. Let's work together to solve this bug and improve your experience!
To prevent function call information from being printed during real-time streaming when using LangChain with the ChatOpenAI model, you can avoid console.log statements within the callback functions and configure the model not to include token usage in the stream. Here is an example of how you can set up the ChatOpenAI model for streaming without printing function call information:
import { ChatOpenAI } from "@langchain/openai";
import { SystemMessage, HumanMessage } from "@langchain/core/messages";
const model = new ChatOpenAI({
  modelName: "gpt-4o",
  streaming: true,
  streamUsage: false, // Disable streaming of token usage
  callbacks: [
    {
      async handleLLMNewToken(token: string) {
        // Handle the new token without printing it
      },
    },
  ],
});
const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];
const modelResponse = await model.stream(messages, {
  signal: modelController.signal,
});
let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);
async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
In this setup, the handleLLMNewToken callback function is defined but does not include any console.log statements, preventing function call information from being printed during real-time streaming. Additionally, the streamUsage option is set to false to ensure that token usage information is not included in the stream [1].
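Note that when the model is bound to tools, streamed AIMessageChunks typically carry tool/function call data in fields such as additional_kwargs and tool_call_chunks rather than a top-level function_call property. The following is only a sketch of a chunk-level filter; the field names assume @langchain/core 0.2.x and should be verified against your installed version:
import { AIMessageChunk } from "@langchain/core/messages";

// Sketch: returns true when a streamed chunk carries tool/function call data
// rather than plain text content (field names assumed from @langchain/core 0.2.x).
function isToolCallChunk(chunk: AIMessageChunk): boolean {
  const hasToolCallChunks =
    Array.isArray(chunk.tool_call_chunks) && chunk.tool_call_chunks.length > 0;
  const hasFunctionCall = Boolean(chunk.additional_kwargs?.function_call);
  return hasToolCallChunks || hasFunctionCall;
}

// Usage inside the streaming loop:
// for await (const chunk of stream) {
//   if (isToolCallChunk(chunk)) continue; // skip tool/function call chunks
//   if (chunk.content) llmResponse += chunk.content;
// }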
To continue talking to Dosu, mention @dosu.
@dosu give an example
Sure! Here's an example of how you can set up the ChatOpenAI model for streaming without printing function call information:
import { ChatOpenAI } from "@langchain/openai";
import { SystemMessage, HumanMessage } from "@langchain/core/messages";
const llm = new ChatOpenAI({
  modelName: "gpt-4o",
  temperature: 0,
  streaming: true,
  streamUsage: false, // Disable streaming of token usage
});
const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];
const modelResponse = await llm.stream(messages, {
  signal: modelController.signal,
});
async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);
In this example, the streamUsage option is set to false to ensure that token usage information is not included in the stream. The processLlmStream function processes the stream and ignores any chunks that contain function call information.
To continue talking to Dosu, mention @dosu.
again same issue @dosu
To prevent function call information from being printed during real-time streaming when using LangChain with ChatOpenAI, you can modify the callback handling to exclude or filter out the function call information. Here is an example of how you can achieve this:
import { ChatOpenAI } from "@langchain/openai";
import { SystemMessage, HumanMessage } from "@langchain/core/messages";
const model = new ChatOpenAI({
  modelName: "gpt-4o",
  streaming: true,
  callbacks: [
    {
      async handleLLMNewToken(token: string) {
        // Filter out function call information
        if (!token.includes("getPerformanceReviews") && !token.includes("getUserFeedback")) {
          console.log(token); // Process the token as needed
        }
      },
    },
  ],
});
const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];
const modelResponse = await model.stream(messages, {
  signal: modelController.signal,
});
async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);
In this example, the handleLLMNewToken callback filters out tokens that include the function call names getPerformanceReviews and getUserFeedback. You can adjust the filtering logic as needed to suit your specific requirements [1].
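For instance, here is a small sketch of a reusable filtering callback; the toolNames list and onToken handler are hypothetical placeholders supplied by your own code:
// Hypothetical helper: builds a handleLLMNewToken callback that drops any token
// mentioning one of the supplied tool names and forwards the rest to onToken.
function makeTokenFilterCallback(
  toolNames: string[],
  onToken: (token: string) => void
) {
  return {
    async handleLLMNewToken(token: string) {
      if (toolNames.some((name) => token.includes(name))) {
        return; // skip tokens that look like tool/function call text
      }
      onToken(token);
    },
  };
}

// Usage with the tool names from the example above:
// callbacks: [makeTokenFilterCallback(["getPerformanceReviews", "getUserFeedback"], (t) => process.stdout.write(t))]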
To continue talking to Dosu, mention @dosu.
Could this behaviour be happening because streamUsage is set to true? @dosu
Yes, the behavior of function call information being printed during real-time streaming can be caused by the streamUsage option being set to true. When streamUsage is true, usage metadata (such as input tokens, output tokens, and total tokens) is included in the response, which can result in function call information being printed [1][2].
To prevent this, you can set streamUsage to false:
const llm = new ChatOpenAI({
  modelName: "gpt-4o",
  temperature: 0,
  streaming: true,
  streamUsage: false, // Disable streaming of token usage
});
Additionally, ensure that your callback handling logic filters out function call information as needed.
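If you want to confirm what streamUsage actually adds to the stream, here is a minimal sketch (reusing the llm and messages defined above, and assuming @langchain/core 0.2.x, where usage data is surfaced as chunk.usage_metadata):
// Minimal sketch: log whether each streamed chunk carries usage metadata or text,
// to see what enabling streamUsage contributes to the stream.
const debugStream = await llm.stream(messages);
for await (const chunk of debugStream) {
  if (chunk.usage_metadata) {
    // e.g. { input_tokens, output_tokens, total_tokens }, typically on the final chunk
    console.log("usage chunk:", chunk.usage_metadata);
  } else if (chunk.content) {
    console.log("content chunk:", chunk.content);
  }
}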
To continue talking to Dosu, mention @dosu.
Checked other resources
Example Code
const llm = new ChatOpenAI({
  model: "gpt-4o",
  temperature: 0,
  streaming: true,
  streamUsage: true,
});

const llmWithTools = llm.bindTools(tools);

const messages = [
  new SystemMessage(getSystemPrompt({ userId, companyId, inputMetadata })),
  ...(currentMessageHistory.length > 0
    ? currentMessageHistory
    : [new HumanMessage(inputMessage)]),
];

const modelResponse = await llmWithTools.stream(messages, {
  signal: modelController.signal,
});

let { llmResponse, aiMessage } = await processLlmStream(
  modelResponse,
  socket,
  id,
  messageId,
  teambledb
);

async function processLlmStream(stream, socket, id, messageId, teambledb) {
  let llmResponse = "";
  let aiMessage = undefined;
  try {
    for await (const chunk of stream) {
      if (!chunk.content && chunk.function_call) {
        // Ignore function calls
        continue;
      } else if (chunk.content) {
        llmResponse += chunk.content;
        socket.emit("ai_response_chunk", { chatId: id, messageId, chunk: chunk.content });
      }
    }
  } catch (error) {
    console.log("Error during LLM stream processing:", error);
  } finally {
    return { llmResponse, aiMessage };
  }
}
Error Message and Stack Trace (if applicable)
No response
Description
I am using LangChain with the ChatOpenAI model and streaming the output in real-time. However, occasionally the output includes function call information.
I would like to prevent function call information from being printed during real-time streaming. Is there a way to handle this? Any suggestions or advice on how to fix this issue would be appreciated.
Steps to Reproduce:
1. Use LangChain with ChatOpenAI.
2. Stream the output in real-time.
3. Observe that sometimes function call details are printed in the stream.
Expected Behavior:
The output should only show the relevant text without printing the function call information.
Current Behavior:
Function call information is sometimes included in the real-time streamed output.
System Info
"@langchain/community": "^0.2.22", "@langchain/core": "^0.2.20", "@langchain/langgraph": "^0.0.31", "@langchain/openai": "^0.2.5", "langchain": "^0.2.12"