mayooear / gpt4-pdf-chatbot-langchain

GPT4 & LangChain Chatbot for large PDF docs
https://www.youtube.com/watch?v=ih9PBGVVOO4
14.91k stars · 3.02k forks

Streaming works locally but not when deployed using vercel edge functions #233

Closed hvri5h closed 8 months ago

hvri5h commented 1 year ago

I got streaming to work using an older commit of this repo, and everything works fine locally. However, when I deploy the app to Vercel, it no longer streams the responses.

I believe I have to use edge functions to get streaming working, so I followed this tutorial to convert the current chat.ts API route into an edge function, but I'm getting the following error while initialising the Pinecone client:

"error  [PineconeError: Failed getting project name. Error: A Node.js API is used (process.nextTick) which is not supported in the Edge Runtime.
Learn more: https://nextjs.org/docs/api-reference/edge-runtime] {
  name: 'PineconeError'
}
error - node_modules/next/dist/build/webpack/loaders/next-edge-function-loader.js?absolutePagePath=%2FUsers%2Fharishtirunahari%2FCode%2Fneuraltalk-chatbot-app%2Fpages%2Fapi%2Fchat-edge.ts&page=%2Fapi%2Fchat-edge&rootDir=%2FUsers%2Fharishtirunahari%2FCode%2Fneuraltalk-chatbot-app! (10:0) @ <unknown>"

Any ideas as to why this is happening? Or any other suggestions for deploying this code so that streaming works?

Here is my code:

chat.ts

import { OpenAIStream } from '@/utils/server/stream';

export const config = {
  runtime: 'edge',
};

const handler = async (req: Request): Promise<Response> => {
  try {
    const { question, chat_history } = (await req.json()) as {
      question: string;
      chat_history: string[][];
    };

    const stream = await OpenAIStream(question, chat_history);

    return new Response(stream);
  } catch (error) {
    console.error(error);
    return new Response('Error', { status: 500 });
  }
};

export default handler;

stream.ts

import {
  PINECONE_INDEX_NAME,
  PINECONE_NAME_SPACE,
  pinecone,
} from '../data/pinecone';
import { makeChain } from './makechain';

import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';

export const OpenAIStream = async (
  question: string,
  chat_history: string[][],
) => {
  const sanitizedQuestion = question.trim().replaceAll('\n', ' ');
  console.log('asking question...', sanitizedQuestion);

  const stream = new ReadableStream({
    async start(controller) {
      console.log('starting stream...');
      const index = pinecone.Index(PINECONE_INDEX_NAME);

      /* create vectorstore */
      const vectorStore = await PineconeStore.fromExistingIndex(
        new OpenAIEmbeddings({}),
        {
          pineconeIndex: index,
          textKey: 'text',
          namespace: PINECONE_NAME_SPACE,
        },
      );

      const encoder = new TextEncoder();

      const sendData = (data: string) => {
        controller.enqueue(encoder.encode(data));
      };

      sendData(' ');

      // create chain
      const chain = makeChain(vectorStore, (token: string) => {
        sendData(token);
      });

      try {
        // Ask a question
        const response = await chain.call({
          question: sanitizedQuestion,
          chat_history: chat_history || [],
        });

        console.log('response', response);
      } catch (error) {
        console.log('error', error);
        controller.error(error);
      } finally {
        controller.close();
      }
    },
  });

  return stream;
};
ThomasEwing04 commented 1 year ago

This feature would be ideal; I have tried many fixes on Vercel, to no avail.

jordanparker6 commented 1 year ago

I have this issue, but with a different architecture.

I have a FastAPI server with SSE hosted on Cloud Run. It streams to my edge function, which then streams to the client. This works fine locally, and even when it's hosted I can log the tokens being streamed into the edge function and view each token in the Vercel dashboard. However, on the client the stream delivers the first token and then stops... Super weird. This only happens when it's hosted.

I believe your issue is due to the Edge Runtime not being compatible with the Pinecone library. You may need to change the setup; one possible split is sketched below.
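
One way to change the setup (a rough sketch, not verified in this thread): keep the Pinecone lookup in a regular Node API route, where the SDK is supported, and let the edge function fetch from it, so only the streaming part runs on the edge. The route path and the pinecone helper import below are assumptions modelled on the code earlier in this issue.

// pages/api/search.ts (hypothetical): stays on the default Node runtime
import type { NextApiRequest, NextApiResponse } from 'next';
import {
  PINECONE_INDEX_NAME,
  PINECONE_NAME_SPACE,
  pinecone,
} from '@/utils/data/pinecone';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  const { vector, topK = 4 } = req.body;

  const index = pinecone.Index(PINECONE_INDEX_NAME);

  // v0.x SDK query shape; adjust for newer SDK versions
  const results = await index.query({
    queryRequest: {
      vector,
      topK,
      includeMetadata: true,
      namespace: PINECONE_NAME_SPACE,
    },
  });

  res.status(200).json(results);
}

The edge function can then call this route with an absolute URL, e.g. fetch(new URL('/api/search', req.url), ...), and never touches the Pinecone SDK itself.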

felipetodev commented 1 year ago

The Pinecone SDK is Node-only 😥 https://github.com/hwchase17/langchainjs/issues/1055

I have the same problem in prod

VladislavKatsubo commented 1 year ago

Hey @haritiruna! Have you managed to solve your problem? Also, could you please share index.ts's handleSubmit method?

assafweinberg commented 1 year ago

@felipetodev - I think neither the Pinecone SDK nor the OpenAI client works in the edge runtime. You can create a replacement client pretty easily, though, using their REST APIs. I verified this works in production on Vercel. The problem might be their clients' use of Axios.

const searchEmbeddings = async (query: string, maxResponses = 2, minConfidence = 0.8) => {
  try {
    // `openai` here is presumably a fetch-based replacement wrapper that
    // returns the raw REST JSON, since the official client doesn't run on edge
    const embeddingResult = await openai.createEmbedding({
      model: 'text-embedding-ada-002',
      input: query,
    });

    const queryVector = embeddingResult.data[0].embedding;
    const res = await fetch(
      `https://${pineconeIndexName}-${pineconeProjectID}.svc.${pineconeEnvironment}.pinecone.io/query`,
      {
        method: 'POST',
        headers: {
          'Api-Key': `${pineconeAPIKey}`,
          'Content-Type': 'application/json', // required when POSTing a JSON body
          Accept: 'application/json',
        },
        body: JSON.stringify({
          vector: queryVector,
          includeValues: false,
          includeMetadata: true,
          namespace: pineconeNamespace,
          topK: maxResponses,
        }),
      }
    );

    const data = await res.json();

    return data.matches
      .filter((m: any) => m.score > minConfidence)
      .map((m: any) => {
        return {
          text: m.metadata.text,
          score: m.score,
        };
      });
  } catch (err) {
    // log the error
    return [];
  }
};

const createChatCompletion = async (options: {
  model: string;
  messages: Array<{ content: string; role: ChatCompletionRequestMessageRoleEnum; name: string }>;
  max_tokens: number;
  temperature: number;
  stream: boolean;
}) => {
  return fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(options),
  });
};
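
For completeness, a minimal sketch of how the two helpers above might be wired into an edge route that streams straight to the client. The file name, model, and prompt shape are illustrative assumptions, not part of the original snippet:

// chat-edge.ts (hypothetical): glue code for the REST helpers above
export const config = { runtime: 'edge' };

const handler = async (req: Request): Promise<Response> => {
  const { question } = (await req.json()) as { question: string };

  // retrieve context through the Pinecone REST endpoint
  const matches = await searchEmbeddings(question);
  const context = matches.map((m) => m.text).join('\n');

  // stream: true makes the OpenAI API return server-sent events in the body
  const completion = await createChatCompletion({
    model: 'gpt-3.5-turbo',
    messages: [
      { role: 'system', content: `Answer using this context:\n${context}`, name: 'system' },
      { role: 'user', content: question, name: 'user' },
    ],
    max_tokens: 512,
    temperature: 0,
    stream: true,
  });

  // pass the upstream SSE body straight through; no Node APIs involved
  return new Response(completion.body, {
    headers: { 'Content-Type': 'text/event-stream' },
  });
};

export default handler;
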
derekurban2001 commented 1 year ago

For anyone else coming in from Google: I've been struggling with this problem for what seems like forever, and none of the answers from Google, Stack Overflow (or even this thread) worked for me.

I found success by adding the following headers to my response. If you want to understand why, look up the X-Content-Type-Options: nosniff header.

return new Response(stream, {
    headers: {
        'Content-Type': 'text/event-stream',
        'X-Content-Type-Options': 'nosniff'
    }
});
harisrab commented 1 year ago

For anyone else coming in from Google: I've been struggling with this problem for what seems like forever, and none of the answers from Google, Stack Overflow (or even this thread) worked for me.

I found success by adding the following headers to my response. If you want to understand why, look up the X-Content-Type-Options: nosniff header.

return new Response(stream, {
  headers: {
      'Content-Type': 'text/event-stream',
      'X-Content-Type-Options': 'nosniff'
  }
});

This didn't really work for me. I'm doing exactly this right now, but no luck.

In my case, the edge-runtime function works locally, but in production it never iterates over the res.

// Enable edge runtime
export const runtime = "edge";

export async function POST(req: Request) {
  const encoder = new TextEncoder();
  const decoder = new TextDecoder();

  const { messages, currentTerminal, user_id } = await req.json();

  console.log("Current Terminal: ", currentTerminal);

  const res = await fetch(URL_GOES_HERE_FOR_FASTAPI_SERVER, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Content-Type-Options": "nosniff",
    },
    body: JSON.stringify({
      // Body
    }),
  });

  console.log("Fetching data");

  const transformStream = new TransformStream({
    async transform(chunk, controller) {
      const content = decoder.decode(chunk);

      controller.enqueue(encoder.encode(content));
    },
  });

  const readableStream = new ReadableStream({
    async start(controller) {
      console.log("Starting streaming response")

      // It doesn't iterate over the body here.
      for await (const chunk of res.body as any) {
        console.log("Chunk: ", decoder.decode(chunk));
        controller.enqueue(chunk);
      }

      // controller.close();
    },
    async pull(controller) {
      controller.close();
    },
  });

  return new Response(
    readableStream.pipeThrough(transformStream),
    {
      headers: {
        "Content-Type": "text/event-stream",
        "X-Content-Type-Options": "nosniff",
      },
    }
  );
}

See the readable stream above: the for await loop over res.body never iterates, so no chunks reach the client and there's an error.
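
One workaround worth trying here (a sketch; FASTAPI_URL stands in for the endpoint elided above): skip the manual iteration entirely and hand the upstream body, which is already a ReadableStream, straight to the client.

export const runtime = 'edge';

export async function POST(req: Request) {
  const { messages } = await req.json();

  // FASTAPI_URL is a placeholder for the SSE endpoint used above
  const upstream = await fetch(FASTAPI_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages }),
  });

  // upstream.body is already a ReadableStream of SSE chunks;
  // returning it directly avoids iterating over it inside the edge function
  return new Response(upstream.body, {
    headers: {
      'Content-Type': 'text/event-stream',
      'X-Content-Type-Options': 'nosniff',
    },
  });
}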

dosubot[bot] commented 1 year ago

Hi, @haritiruna. I'm Dosu, and I'm helping the gpt4-pdf-chatbot-langchain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, the issue is related to streaming not working when the project is deployed using Vercel edge functions. It seems that the use of a Node.js API, specifically process.nextTick, is not supported in the Edge Runtime. Some suggestions have been made, such as changing the setup to use REST APIs instead of the Pinecone SDK. Additionally, one user shared a solution involving adding specific headers to the response. However, another user reported that this solution did not work for them and shared their code where the iteration over the response body fails.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the project.

nijynot commented 1 year ago

For anyone else coming in from Google: I've been struggling with this problem for what seems like forever, and none of the answers from Google, Stack Overflow (or even this thread) worked for me.

I found success by adding the following headers to my response. If you want to understand why, look up the X-Content-Type-Options: nosniff header.

return new Response(stream, {
  headers: {
      'Content-Type': 'text/event-stream',
      'X-Content-Type-Options': 'nosniff'
  }
});

Running Vercel Edge locally, adding this did indeed do the trick for me. Thank you sir.

dosubot[bot] commented 1 year ago

@mayooear Could you please help @haritiruna with this issue? They have indicated that the problem is still relevant and have shared a potential solution involving adding specific headers to the response. Thank you!

dosubot[bot] commented 9 months ago

Hi, @haritiruna,

I'm helping the gpt4-pdf-chatbot-langchain team manage their backlog and am marking this issue as stale. It seems like you encountered an error when trying to initialize the Pinecone client, which is not supported in the Vercel Edge Runtime. There have been suggestions from other users to use REST APIs instead of the Pinecone SDK and to add specific headers to the response.

Could you please confirm if this issue is still relevant to the latest version of the gpt4-pdf-chatbot-langchain repository? If it is, please let the gpt4-pdf-chatbot-langchain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you!