vercel / ai-chatbot

A full-featured, hackable Next.js AI chatbot built by Vercel
https://chat.vercel.ai

Help: how to convert LangChain streamedResponse to StreamingTextResponse (Vercel AI SDK) #103

Open anhhtca opened 1 year ago

anhhtca commented 1 year ago

Hi,

I have a backend API with the code for /api/chat as follows:

```ts
let streamedResponse = '';

const nonStreamingModel = new ChatOpenAI({
  modelName: 'gpt-3.5-turbo',
}, configuration);

const streamingModel = new ChatOpenAI({
  streaming: true,
  callbacks: [
    {
      handleLLMNewToken(token) {
        streamedResponse += token;
      },
    },
  ],
}, configuration);

const embedding = new OpenAIEmbeddings({}, configuration);

try {
  const vectorStore = await HNSWLib.load(`${baseDirectoryPath}/docs/index/data/`, embedding);

  const chain = ConversationalRetrievalQAChain.fromLLM(
    streamingModel,
    vectorStore.asRetriever(),
    {
      memory: new BufferMemory({
        memoryKey: 'chat_history',
        returnMessages: true,
      }),
      questionGeneratorChainOptions: {
        llm: nonStreamingModel,
        template: await CONDENSE_PROMPT.format({ question: userContent, chat_history }),
      },
      qaChainOptions: {
        type: 'stuff',
        prompt: QA_PROMPT,
      },
    }
  );

  await chain.call({ question: userContent });
  return streamedResponse;
} catch (error) {
  console.error(error);
}
```

The front-end I am using is vercel/ai-chatbot. The chat page retrieves a StreamingTextResponse and binds it to the UI.

I am really unsure how to convert the LangChain streamed text into a StreamingTextResponse for the front-end.

Thank you for your help!

joshdumoulin commented 1 year ago

I'm also stuck with something very similar to this @anhhtca. Did you manage to make any progress?

anhhtca commented 1 year ago

> I'm also stuck with something very similar to this @anhhtca. Did you manage to make any progress?

I have converted it to a StreamingTextResponse, but the front-end is still not streaming. 😅

xleven commented 1 year ago

With the latest version of vercel/ai, you can do LangChain streaming by:

  1. using the stream interface of the LangChain Expression Language, e.g. `prompt.pipe(model).pipe(outputParser)` (see the sketch after this list). Example
  2. adding callbacks to your LangChain call. Example
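
A minimal sketch of the first approach, based on the Vercel AI SDK's LangChain example (the trivial prompt template and last-message handling here are illustrative assumptions, not code from this repo):

```ts
import { StreamingTextResponse } from 'ai';
import { ChatOpenAI } from 'langchain/chat_models/openai';
import { PromptTemplate } from 'langchain/prompts';
import { BytesOutputParser } from 'langchain/schema/output_parser';

export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();

  // A trivial prompt for illustration; a real route would include history.
  const prompt = PromptTemplate.fromTemplate('{question}');
  const model = new ChatOpenAI({ streaming: true });
  const outputParser = new BytesOutputParser();

  // LCEL: prompt -> model -> byte parser. .stream() yields encoded chunks
  // that StreamingTextResponse can return to the client directly.
  const chain = prompt.pipe(model).pipe(outputParser);
  const stream = await chain.stream({
    question: messages[messages.length - 1].content,
  });

  return new StreamingTextResponse(stream);
}
```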
coozywana commented 1 year ago

@xleven Any idea of changing the chat model from OpenAI to something else?

xleven commented 1 year ago

> @xleven Any idea of changing the chat model from OpenAI to something else?

It should be the same as long as the chat model supports streaming.
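
For example, a minimal sketch swapping in Anthropic (assuming LangChain's ChatAnthropic integration and an ANTHROPIC_API_KEY in the environment; everything else stays the same as the OpenAI version):

```ts
import { StreamingTextResponse, LangChainStream } from 'ai';
import { ChatAnthropic } from 'langchain/chat_models/anthropic';
import { AIMessage, HumanMessage } from 'langchain/schema';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const { stream, handlers } = LangChainStream();

  // Only the model class changes; the streaming wiring is identical.
  const model = new ChatAnthropic({ streaming: true });

  model
    .call(
      messages.map((m: { role: string; content: string }) =>
        m.role === 'user' ? new HumanMessage(m.content) : new AIMessage(m.content)
      ),
      {},
      [handlers]
    )
    .catch(console.error);

  return new StreamingTextResponse(stream);
}
```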

coozywana commented 1 year ago

> @xleven Any idea of changing the chat model from OpenAI to something else?
>
> It should be the same as long as the chat model supports streaming.

The way Vercel uses LangChain is a lot different to the official LangChain documentation. I can't seem to get any provider other than OpenAI working with the example you provided before. Using LangChain, how do I find the names of the different providers?

Example: the import below works:

`import { ChatOpenAI } from 'langchain/chat_models/openai';`

but the one below doesn't:

`import { ChatReplicate } from 'langchain/chat_models/replicate';`

xleven commented 1 year ago

@coozywana You can always refer to the docs. From what I know, there is no ChatReplicate under Chat Models yet, while a [Replicate](https://js.langchain.com/docs/modules/model_io/models/llms/integrations/replicate) does exist under LLMs.
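
For example, a minimal sketch using the Replicate LLM (the model id is a hypothetical placeholder; REPLICATE_API_TOKEN must be set):

```ts
import { Replicate } from 'langchain/llms/replicate';

// Replicate lives under LLMs, not Chat Models, so it takes a plain string prompt.
const model = new Replicate({
  model: 'replicate-account/model-name:version-hash', // placeholder id, not a real model
});

const output = await model.call('What is a vector store?');
console.log(output);
```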

coozywana commented 1 year ago

@xleven Yes, thanks, I found out how to solve that. Also, how do I implement LangChain with the Vercel KV?

coozywana commented 1 year ago

Below is the code from the LangChain doc; it's a lot different to the original one in the current repo that has the KV connected:

```ts
import { StreamingTextResponse, LangChainStream } from 'ai';
import { ChatOpenAI } from 'langchain/chat_models/openai';
import { AIMessage, HumanMessage } from 'langchain/schema';

export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const { stream, handlers, writer } = LangChainStream();

  const llm = new ChatOpenAI({ streaming: true });

  llm
    .call(
      messages.map(m =>
        m.role == 'user' ? new HumanMessage(m.content) : new AIMessage(m.content)
      ),
      {},
      [handlers]
    )
    .catch(console.error);

  return new StreamingTextResponse(stream);
}
```
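
Note that `llm.call` is deliberately not awaited: the `handlers` returned by `LangChainStream` are LangChain callback handlers that write each token into `stream`, so the route returns the `StreamingTextResponse` immediately while generation continues in the background.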

scottklein7 commented 1 year ago

@coozywana try this (the onCompletion callback passed to LangChainStream; swap the Supabase upsert for KV):

```ts
const { stream, handlers } = LangChainStream({
  async onCompletion(completion) {
    console.log(completion, 'completion here');
    const title = json.messages[0].content.substring(0, 100);
    const id = json.id ?? nanoid();
    const createdAt = Date.now();
    const path = `/chat/${id}`;
    const payload = {
      id,
      title,
      userId,
      createdAt,
      path,
      messages: [
        ...messages,
        {
          content: completion,
          role: 'assistant'
        }
      ]
    };
    // REPLACE WITH KV
    await supabase.from('chats').upsert({ id, payload }).throwOnError();
  }
});
```

coozywana commented 1 year ago

@scottklein7 Thanks, I got up to that step, but I'm not sure where to put it for LangChain.

For OpenAI you just do:

`const stream = OpenAIStream(res, { async onCompletion(completion) { ... } })`

but with LangChain it's:

`const stream = await chain.stream({ chat_history: formattedPreviousMessages.join('\n'), ... })`

coozywana commented 1 year ago

@scottklein7 Also, could you please provide the whole code? If I connect the KV, I'm not sure whether I still keep the basic memory formatter for LangChain.

scottklein7 commented 1 year ago

@coozywana This is an approach I took in prior iterations:

```ts
const { stream, handlers } = LangChainStream({
  async onCompletion(completion) {
    console.log('completion1111', completion)
    const title = body.messages[0].content.substring(0, 100)
    const id = body.id ?? nanoid()
    const createdAt = Date.now()
    const path = `/chat/${id}`
    const payload = {
      id,
      title,
      userId: `uid-${userId}`,
      createdAt,
      path,
      messages: [
        ...messages,
        {
          content: completion,
          role: 'assistant'
        }
      ]
    }
    await kv.hmset(`chat:${id}`, payload)
    await kv.zadd(`user:chat:${userId}`, {
      score: createdAt,
      member: `chat:${id}`
    })
  }
})

const streamingModel = new ChatOpenAI({
  modelName: 'gpt-4',
  streaming: true,
  verbose: true,
  temperature: 0
})

const nonStreamingModel = new ChatOpenAI({
  modelName: 'gpt-4',
  verbose: true,
  temperature: 0
})

const chain = ConversationalRetrievalQAChain.fromLLM(
  streamingModel,
  vectorStore.asRetriever(),
  {
    qaTemplate: templates.qaPrompt,
    questionGeneratorTemplate: templates.condensePrompt,
    memory: new BufferMemory({
      memoryKey: 'chat_history',
      inputKey: 'question', // the key for the input to the chain
      outputKey: 'text', // the key for the final conversational output of the chain
      returnMessages: true // required when using a chat model (e.g. gpt-3.5 or gpt-4)
    }),
    questionGeneratorChainOptions: {
      llm: nonStreamingModel
    }
  }
)

// call the chain without awaiting so tokens stream through the handlers
chain.call(
  {
    question: sanitizedQuestion,
    chat_history: chatHistory
  },
  [handlers]
)

// return the readable stream
return new StreamingTextResponse(stream)
```

coozywana commented 1 year ago

@scottklein7 Thank you. I'm still a little confused about how to implement the KV with this: https://sdk.vercel.ai/docs/guides/providers/langchain#example
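
One hedged way to bridge the two (a sketch under assumptions, not this repo's actual code: `chain`, `messages`, and `userId` are taken to exist in the route): since `chain.stream` from the SDK's LCEL example has no onCompletion hook, you can tee the stream, return one branch to the client, and drain the other to build the completion for KV:

```ts
import { kv } from '@vercel/kv';
import { StreamingTextResponse } from 'ai';
import { nanoid } from 'nanoid';

// ...inside POST, after building the LCEL chain as in the SDK example:
const stream = await chain.stream({ question });

const id = nanoid(); // hypothetical: a real route would reuse the incoming chat id

// tee() gives two identical branches of the byte stream
const [uiStream, kvStream] = stream.tee();

// drain the second branch in the background, then persist the chat record
// using the same kv.hmset / kv.zadd pattern as the repo
// (note: on the edge runtime this background work may need a waitUntil-style keep-alive)
;(async () => {
  const reader = kvStream.getReader();
  const decoder = new TextDecoder();
  let completion = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    completion += decoder.decode(value, { stream: true });
  }

  const createdAt = Date.now();
  const payload = {
    id,
    title: messages[0].content.substring(0, 100),
    userId,
    createdAt,
    path: `/chat/${id}`,
    messages: [...messages, { content: completion, role: 'assistant' }]
  };
  await kv.hmset(`chat:${id}`, payload);
  await kv.zadd(`user:chat:${userId}`, { score: createdAt, member: `chat:${id}` });
})();

return new StreamingTextResponse(uiStream);
```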

scottklein7 commented 1 year ago

@coozywana did you figure it out?

coozywana commented 1 year ago

@scottklein7 nope