lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), knowledge base (file upload / knowledge management / RAG), multi-modal (Vision/TTS), and a plugin system. One-click FREE deployment of your private ChatGPT/Claude application.
https://chat-preview.lobehub.com

[Bug] OpenAI o1 model timing out due to Vercel 25 second limit on initial response from Edge Function #4046

Closed nathanclay-trinity closed 1 month ago

nathanclay-trinity commented 1 month ago

📦 Environment

Vercel

📌 Version

1.19.11

💻 Operating System

Windows

🌐 Browser

Chrome

🐛 Bug Description

When OpenAI's o1 model takes more than 25 seconds to return its initial response, the request fails with a 504 Gateway Timeout error.

📷 Recurrence Steps

No response

🚦 Expected Behavior

(screenshot attached showing the expected behavior)

📝 Additional Information

No response

lobehubbot commented 1 month ago

👀 @nathanclay-trinity

Thank you for raising an issue. We will investigate the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

dosubot[bot] commented 1 month ago

Hey @nathanclay-trinity! I'm here to help you with bug fixes, answer questions, and even guide you on how to become a contributor. Let's squash those bugs together! 🤖

The issue you're encountering is due to Vercel's 25-second limit on the initial response from Edge Functions. This can be problematic for long-running requests, such as those involving the OpenAI o1 model.

To address this, you can leverage the streaming capabilities of the Lobe Chat framework to ensure that the response starts streaming back to the client before the 25-second limit is reached. The framework already supports streaming responses, which can help mitigate this issue.

Here are some steps to ensure streaming is properly implemented:

  1. Ensure Streaming is Enabled: Make sure that the stream parameter is set to true when making requests to the OpenAI API. This is already handled in the createChatCompletion method.

    const response = await openai.chat.completions.create(
      {
        messages: formatMessages,
        ...params,
        stream: true,
      },
      { headers: { Accept: '*/*' } },
    );
  2. Handle Streaming Responses: Use the OpenAIStream utility to process the streaming response and return it as a StreamingTextResponse.

    import { OpenAIStream, StreamingTextResponse } from 'ai';

    const stream = OpenAIStream(response);
    return new StreamingTextResponse(stream);
  3. Frontend Handling: Ensure that the frontend is set up to handle streaming responses using the fetchSSE method.

    export const fetchSSE = async (fetchFn: () => Promise<Response>, options: FetchSSEOptions = {}) => {
      const response = await fetchFn();
    
      if (!response.ok) {
        const chatMessageError = await getMessageError(response);
        options.onErrorHandle?.(chatMessageError);
        return;
      }
    
      const returnRes = response.clone();
      const data = response.body;
    
      if (!data) return;
      const reader = data.getReader();
      const decoder = new TextDecoder();
    
      // Accumulate the full text so it can be passed to onFinish at the end.
      let output = '';
      let done = false;
    
      while (!done) {
        const { value, done: doneReading } = await reader.read();
        done = doneReading;
        const chunkValue = decoder.decode(value, { stream: true });
        output += chunkValue;
    
        options.onMessageHandle?.(chunkValue);
      }
    
      await options.onFinish?.(output);
    
      return returnRes;
    };

By ensuring that the response starts streaming back to the client within the 25-second limit, you can avoid the 504 Gateway Timeout error. This approach leverages the existing streaming capabilities of the Lobe Chat framework to handle long-running requests more effectively [1][2].
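The reader loop in step 3 can be exercised locally with a fake streamed body. Below is a minimal sketch (the `readStream` helper and the sample chunks are hypothetical, not LobeChat code; it assumes Node 18+ for the global `ReadableStream` and `TextDecoder`) showing how decoded chunks accumulate into the full output passed to `onFinish`:

```javascript
// Sketch: simulate reading a streamed response body chunk-by-chunk,
// the same pattern fetchSSE uses. The fake stream below stands in for
// `response.body` from a streaming fetch.

const encoder = new TextEncoder();

// Fake streamed body: three chunks, as an Edge Function would emit them.
const fakeBody = new ReadableStream({
  start(controller) {
    for (const part of ['Hel', 'lo ', 'world']) {
      controller.enqueue(encoder.encode(part));
    }
    controller.close();
  },
});

// Hypothetical helper mirroring the fetchSSE reader loop above.
async function readStream(body, onChunk) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let output = '';
  let done = false;
  while (!done) {
    const { value, done: doneReading } = await reader.read();
    done = doneReading;
    const chunk = decoder.decode(value, { stream: true });
    output += chunk;
    if (chunk) onChunk(chunk); // fires per chunk, like onMessageHandle
  }
  return output; // full text, like the argument to onFinish
}

readStream(fakeBody, (c) => process.stdout.write(c)).then((full) => {
  console.log('\nfull:', full);
});
```

Because the first chunk is forwarded as soon as it arrives, the client sees bytes well before the 25-second limit even if the full completion takes much longer.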

To continue talking to Dosu, mention @dosu.

arvinxx commented 1 month ago

You can try using fetch from the client.
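As a sketch of what "fetch from the client" means here: the browser talks to the model provider directly, so the request never passes through the Vercel Edge Function and its 25-second first-byte limit. The helper below only builds the request (the model name is an assumption, and `stream: true` assumes the provider supports streaming for that model):

```javascript
// Sketch: build a direct browser-to-provider request. Only the request
// object is constructed here; in the browser you would pass it to fetch().

function buildDirectRequest(apiKey, messages) {
  return {
    url: 'https://api.openai.com/v1/chat/completions', // OpenAI's chat endpoint
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'o1-preview', // assumption: the o1 model from this report
        messages,
        stream: true, // assumes the provider supports streaming for this model
      }),
    },
  };
}

// In the browser you would then run:
//   const { url, init } = buildDirectRequest(key, msgs);
//   const res = await fetch(url, init); // consume res.body incrementally

const req = buildDirectRequest('sk-test', [{ role: 'user', content: 'hi' }]);
console.log(req.init.method, JSON.parse(req.init.body).model);
```

The trade-off is that the API key must be available in the browser, which is why this mode is typically used with a user-supplied key rather than a server-side secret.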

lobehubbot commented 1 month ago

✅ @nathanclay-trinity

This issue is closed. If you have any questions, you can comment and reply.

huoxianribao commented 1 month ago

you can try to use fetch from client

This needs a complete tutorial. I am using a third-party API, and it already returns data normally, but LobeHub still reports a 504.

wangfeng35 commented 1 month ago

This needs a complete tutorial. I am using a third-party API, and it already returns data normally, but LobeHub still reports a 504.

+1. The third-party API responds, but Lobe still returns a 504, and I don't know what to change. Thank you.
