langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

ChatModel forced to stream if called in streaming context regardless of config or support #6946

Open airhorns opened 2 weeks ago

airhorns commented 2 weeks ago

Example Code

import { ChatGroq } from "@langchain/groq";
import { HumanMessage } from "@langchain/core/messages";

export const model = new ChatGroq({
  model: "mixtral-8x7b-32768",
  streaming: false,
});

// Works fine outside of a LangGraph or LangChain streaming context, but fails
// if run inside an outer `.streamEvents` call.
await model
  .bind({ response_format: { type: "json_object" } })
  .invoke([new HumanMessage("generate some example JSON")]);
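
For concreteness, a minimal sketch of the failing setup (the `RunnableLambda` wrapper is illustrative; any outer runnable consumed via `.streamEvents` reproduces it):

import { RunnableLambda } from "@langchain/core/runnables";

// Illustrative outer runnable that calls the non-streaming model internally.
const chain = RunnableLambda.from(async (input: string) =>
  model
    .bind({ response_format: { type: "json_object" } })
    .invoke([new HumanMessage(input)])
);

// Consuming the chain via streamEvents puts the inner invoke() in a streaming
// context, which triggers the 400 below even though streaming: false is set
// on the model itself.
for await (const event of chain.streamEvents("generate some example JSON", {
  version: "v2",
})) {
  // inspect event.event / event.data here
}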

Error Message and Stack Trace (if applicable)

400 {"error":{"message":"response_format` does not support streaming","type":"invalid_request_error"}} Error: 400 {"error":{"message":"response_format` does not support streaming","type":"invalid_request_error"}} at Function.generate (/Users/airhorns/Code/gadget/node_modules/.pnpm/groq-sdk@0.5.0/node_modules/groq-sdk/src/error.ts:58:14) at Groq.makeStatusError (/Users/airhorns/Code/gadget/node_modules/.pnpm/groq-sdk@0.5.0/node_modules/groq-sdk/src/core.ts:397:21) at Groq.makeRequest (/Users/airhorns/Code/gadget/node_modules/.pnpm/groq-sdk@0.5.0/node_modules/groq-sdk/src/core.ts:460:24) at processTicksAndRejections (node:internal/process/task_queues:95:5) at async RetryOperation._fn (/Users/airhorns/Code/gadget/node_modules/.pnpm/p-retry@4.5.0/node_modules/p-retry/index.js:50:12)

Description

I don't think it's safe to assume the model can stream just because the ambient context wants it to. If specific params passed to the invocation say not to stream, it shouldn't. Without a way to override this, I can't use Groq's JSON support at all from within LangGraph, or from any larger LangChain pipeline that is streaming tokens from other models!
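
As a possible escape hatch, newer @langchain/core releases expose a `disableStreaming` field on chat models that pins them to the non-streaming code path regardless of the ambient context. A minimal sketch, assuming your installed core version has this field (check before relying on it):

import { ChatGroq } from "@langchain/groq";

// Assumed API: `disableStreaming` (if present in your @langchain/core
// version) forces the non-streaming request path even when an outer
// `.streamEvents` call would otherwise switch the model to streaming.
const jsonModel = new ChatGroq({
  model: "mixtral-8x7b-32768",
  disableStreaming: true,
});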

System Info

airhorns@hugs ~/C/gadget (genny-groq)> pnpm info langchain

langchain@0.3.2 | MIT | deps: 12 | versions: 297
Typescript bindings for langchain
https://github.com/langchain-ai/langchainjs/tree/main/langchain/

keywords: llm, ai, gpt3, chain, prompt, prompt engineering, chatgpt, machine learning, ml, openai, embeddings, vectorstores

dist
.tarball: https://registry.npmjs.org/langchain/-/langchain-0.3.2.tgz
.shasum: aec3e679d3d6c36f469448380affa475c92fbd86
.integrity: sha512-kd2kz1cS/PIVrLEDFlrZsAasQfPLbY1UqCZbRKa3/QcpB33/n6xPDvXSMfBuKhvNj0bjW6MXDR9HZTduXjJBgg==
.unpackedSize: 2.9 MB

dependencies:
@langchain/openai: >=0.1.0 <0.4.0        jsonpointer: ^5.0.1                      uuid: ^10.0.0
@langchain/textsplitters: >=0.0.0 <0.2.0 langsmith: ^0.1.56-rc.1                  yaml: ^2.2.1
js-tiktoken: ^1.0.12                     openapi-types: ^12.1.3                   zod-to-json-schema: ^3.22.3
js-yaml: ^4.1.0                          p-retry: 4                               zod: ^3.22.4

maintainers:
- nfcampos <nuno@boringbits.io>
- jacoblee93 <jacoblee93@gmail.com>
- andrewnguonly <andrewnguonly@gmail.com>
- davidduong <david@duong.cz>
- hwchase17 <hw.chase.17@gmail.com>
- basproul <braceasproul@gmail.com>

dist-tags:
latest: 0.3.2                              next: 0.3.2-rc.0                           tag-for-publishing-older-releases: 0.2.20
dosubot[bot] commented 2 weeks ago

To address the issue where the ChatGroq model defaults to streaming in a LangChain streaming context, you need to ensure that the streaming property is explicitly set to false in the model configuration. This should override the default behavior even in a streaming context.

Here's how you can configure it:

import { ChatGroq } from '@langchain/groq';
import { HumanMessage } from '@langchain/core/messages';

const model = new ChatGroq({
  model: "mixtral-8x7b-32768",
  streaming: false, // Explicitly disable streaming
});

// Bind the response format to JSON
await model
  .bind({ response_format: { type: "json_object" } })
  .invoke([new HumanMessage("generate some example JSON")]);

The issue arises because the LangChain context might force streaming due to its internal logic, which checks for streaming conditions like signals or timeouts [1]. By setting streaming: false, you should be able to prevent this behavior and use the JSON response format without encountering errors.
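
To make that mechanism concrete, here is a simplified, hypothetical sketch of the decision being described. This is illustrative TypeScript, not the actual langchainjs source; `ambientContextWantsStreaming` and the declared `model` are stand-ins:

import type { AIMessageChunk, BaseMessage } from "@langchain/core/messages";
import type { RunnableConfig } from "@langchain/core/runnables";
import type { ChatGroq } from "@langchain/groq";

// Hypothetical stand-ins: the model from the issue above, and the library's
// internal "is anyone consuming a stream?" check.
declare const model: ChatGroq;
declare function ambientContextWantsStreaming(config?: RunnableConfig): boolean;

async function invokeWithAmbientStreaming(
  messages: BaseMessage[],
  config?: RunnableConfig
) {
  if (ambientContextWantsStreaming(config)) {
    // Forced streaming path: chunks are streamed and merged back together.
    // This is what breaks provider params that reject stream=true, such as
    // Groq's response_format: { type: "json_object" }.
    let full: AIMessageChunk | undefined;
    for await (const chunk of await model.stream(messages, config)) {
      full = full === undefined ? chunk : full.concat(chunk);
    }
    return full;
  }
  // Non-streaming path: a single request, honoring streaming: false.
  return await model.invoke(messages, config);
}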


andrewdoro commented 1 week ago

having the same issue

kodychik commented 1 week ago

If no one else is working on this right now, I'd like to take a look at this issue. Thanks