ShipBit / slickgpt

SlickGPT is a light-weight "use-your-own-API-key" web client for the OpenAI API written in Svelte. It offers GPT-4 integration, a userless share feature and other superpowers.
https://slickgpt.vercel.app
MIT License
461 stars 99 forks

Replies occasionally being "swallowed" due to abnormal API stream completion. #96

Open xmoiduts opened 3 months ago

xmoiduts commented 3 months ago

Also refer to #91 @Arro

I have been investigating an intermittent issue with SlickGPT where reply messages sometimes vanish unexpectedly.

Details:

Debugging Observations: I used Chrome's DevTools to observe the API response streams for both normal and problematic reply messages. Here's what I found (note: this is a self-deployed instance that does not use the official OpenAI API endpoint).

Normal stream example:

message {"id":"chatcmpl-K6xxx2J","object":"chat.completion.chunk","created":1712327959,"model":"xxx","choices":[{"index":0,"delta":{"content":" with"},"finish_reason":null}],"system_fingerprint":"fp_60xxb3"} 
22:39:26.771
message {"id":"chatcmpl-K6xxx2J","object":"chat.completion.chunk","created":1712327959,"model":"xxx","choices":[{"index":0,"delta":{"content":"."},"finish_reason":null}],"system_fingerprint":"fp_60xxb3"} 
22:39:26.771
message {"id":"chatcmpl-K6xxx2J","object":"chat.completion.chunk","created":1712327959,"model":"xxx","choices":[{"index":0,"delta":{"content":""},"finish_reason":"stop"}],"system_fingerprint":"fp_60xxb3"}    
22:39:26.771
message [DONE]

Problematic stream example:

message {"id":"chatcmpl-K6xxx2J","object":"chat.completion.chunk","created":1712327959,"model":"xxx","choices":[{"index":0,"delta":{"content":" with"},"finish_reason":null}],"system_fingerprint":"fp_60xxb3"} 
22:39:26.771
message {"id":"chatcmpl-K6xxx2J","object":"chat.completion.chunk","created":1712327959,"model":"xxx","choices":[{"index":0,"delta":{"content":"."},"finish_reason":null}],"system_fingerprint":"fp_60xxb3"} 
22:39:26.771
message {"id":"chatcmpl-K6xxx2J","object":"chat.completion.chunk","created":1712327959,"model":"xxx","choices":[{"index":0,"delta":{"content":""},"finish_reason":null}],"system_fingerprint":"fp_60xxb3"}  
22:39:26.771
message [DONE]

In the problematic stream, the last chunk before the [DONE] message has finish_reason set to null instead of "stop". Also, when this happens, the Chrome console reports that [DONE] cannot be parsed as JSON.

Inferred Cause:

SlickGPT likely relies on the finish_reason being set to "stop" to determine when a reply has fully streamed and should be displayed as a completed message. When the API response stream does not correctly set finish_reason to "stop", SlickGPT treats the reply as incomplete and does not properly retain the message box.

More notes: As I can't access the official OpenAI API right now, I use a translation API that accepts OpenAI-style API calls and forwards them to OpenAI and other models such as Claude (off-topic: yes, with a translation API and a customized endpoint, a vanilla SlickGPT clone can access Claude). Its behavior may differ from the official GPT API.

However, since I have also observed the issue while using the Vercel-hosted SlickGPT, I suspect that this stream behavior is what causes it.

My suggestion:

As I'm no web development or API expert, I can only guess at how to solve the issue:

  1. Use [DONE] as the indicator that the AI reply has ended;
  2. or add a "Retrieve Generation" button as a mitigation for when the problematic reply stream occurs, which stops waiting for more AI reply chunks and truncates the message "as is";
  3. See how other web GPT chatbots deal with API abnormalities. I haven't debugged it, but https://github.com/Bin-Huang/chatbox didn't seem to swallow messages with the same translation API providers while I was debugging SlickGPT.
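
Suggestion 1 might look roughly like the sketch below. It is only an illustration, not SlickGPT's actual code: `StreamEvent` and `classifyChunk` are hypothetical names, and the chunk shape is assumed to match the stream examples above. The idea is to test for the [DONE] sentinel before attempting to parse the payload, while still honoring finish_reason "stop" when the provider does send it.

```typescript
// Hypothetical helper (not part of SlickGPT): classifies one streamed `data` payload.
type StreamEvent =
  | { kind: 'delta'; text: string } // more reply text to append
  | { kind: 'done' };               // the reply has ended

function classifyChunk(data: string): StreamEvent {
  // Check the sentinel first: "[DONE]" is not valid JSON, so parsing it would throw.
  if (data.trim() === '[DONE]') {
    return { kind: 'done' };
  }

  const parsed = JSON.parse(data);
  const choice = parsed?.choices?.[0];

  // Keep the existing signal: finish_reason "stop" also ends the reply.
  if (choice?.finish_reason === 'stop') {
    return { kind: 'done' };
  }

  // Otherwise treat the chunk as a (possibly empty) piece of the reply.
  return { kind: 'delta', text: choice?.delta?.content ?? '' };
}
```

With a helper like this, a final chunk whose finish_reason is null would simply yield an empty delta, and the following [DONE] would still end the reply.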

Shackless commented 3 months ago

Hi and thanks for this comprehensive report! I think it's exactly what you think it is. Apparently, with your setup the end of the completion is sometimes inconsistent and SlickGPT can't handle that. The function in question is in src/lib/ChatInput.svelte and looks like this:

function handleAnswer(event: MessageEvent<any>) {
    try {
        if ($isPro) {
            // irrelevant for your case
        } else {
            const completionResponse: any = JSON.parse(event.data);
            const isFinished = completionResponse.choices[0].finish_reason === 'stop';
            if (event.data !== '[DONE]' && !isFinished) {
                const delta: string = completionResponse.choices[0].delta.content || '';
                showLiveResponse(delta);
            } else {
                addCompletionToChat();
            }
        }
    } catch (err) {
        // this will remove the message from the chat view if hit
        handleError(err);
    }
}

So apparently, something throws an error in the problematic cases and I'd like to know where exactly. Do you think you could debug this? This is probably easier than me trying to replicate your setup. It's pretty easy:

Because you're now running from source (including source maps etc.), you can press CTRL/CMD+P in the Chrome DevTools and put a breakpoint in ChatInput.svelte directly.

If we find a solution for this, I'll be happy to take or create a PR for the fix!
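
As a rough way to narrow down where the error comes from without the original setup, the two streams from the report can be replayed through a stripped-down copy of the non-Pro branch. This is only a sketch: `replay` and `chunk` are hypothetical helpers, the UI calls are replaced by a plain string, and it assumes that no further events are handled once addCompletionToChat() has run. Under those assumptions, the throw appears to come from JSON.parse being called on "[DONE]" before the `event.data !== '[DONE]'` check is reached.

```typescript
// Simplified stand-in for the non-Pro branch of handleAnswer.
// Assumption: after the completion is added, no further events are processed.
type Outcome = 'completed' | 'error' | 'pending';

function replay(stream: string[]): { outcome: Outcome; text: string } {
  let text = '';
  for (const data of stream) {
    try {
      // Same order as the original: parse first, compare with '[DONE]' afterwards.
      const completionResponse: any = JSON.parse(data);
      const isFinished = completionResponse.choices[0].finish_reason === 'stop';
      if (data !== '[DONE]' && !isFinished) {
        text += completionResponse.choices[0].delta.content || '';
      } else {
        return { outcome: 'completed', text }; // stands in for addCompletionToChat()
      }
    } catch {
      return { outcome: 'error', text: '' }; // stands in for handleError(), message dropped
    }
  }
  return { outcome: 'pending', text }; // stream ended without any end signal
}

// Builds a chunk payload shaped like the ones in the report (fields trimmed).
const chunk = (content: string, finish: string | null): string =>
  JSON.stringify({ choices: [{ index: 0, delta: { content }, finish_reason: finish }] });

const normal = [chunk(' with', null), chunk('.', null), chunk('', 'stop'), '[DONE]'];
const problematic = [chunk(' with', null), chunk('.', null), chunk('', null), '[DONE]'];

console.log(replay(normal).outcome);      // completed
console.log(replay(problematic).outcome); // error
```

In the normal stream the "stop" chunk ends the reply before [DONE] is ever parsed, which would explain why the problem only shows up when the last chunk carries finish_reason null.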

Shackless commented 1 month ago

@xmoiduts any news on this?