D7EAD / liboai

A C++17 library to access the entire OpenAI API.
https://openai.com/api/
MIT License

The AppendStreamData() function may lose some data. #62

Open pkusunjy opened 1 month ago

pkusunjy commented 1 month ago

Describe the bug

The AppendStreamData() function may lose some of the data returned by a streaming request. The response string passed to it may contain one or more JSON objects, and the last of them is followed by data: [DONE]. As a result, the final data returned by the streaming request can look like this:

data: {"id":"xxx","object":"chat.completion.chunk",...}\ndata: [DONE]

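In this scenario, searching for data: [DONE] returns a valid position (not std::string::npos), so the function treats the response as complete and returns. The JSON object that precedes the marker in the same string is never processed, resulting in data loss.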
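To make the scenario concrete, here is a minimal sketch of separating the text that precedes the marker so it can still be processed. It is not liboai code: SplitAtDone is a hypothetical helper name, and the only assumption is that the marker appears literally as data: [DONE] in the string.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not part of liboai): returns the text that precedes
// "data: [DONE]" together with a flag saying whether the marker was found.
std::pair<std::string, bool> SplitAtDone(const std::string& data) {
    static const std::string kDone = "data: [DONE]";
    const std::size_t pos = data.find(kDone);
    if (pos == std::string::npos) {
        return {data, false};           // no marker: the whole string is payload
    }
    return {data.substr(0, pos), true}; // marker found: keep what came before it
}
```

With the example string above, the returned text would still contain the final chat.completion.chunk line, which is the part that is currently dropped.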

To Reproduce

Fetch tag v4.0.1

Code snippets

bool liboai::Conversation::AppendStreamData(std::string data) & noexcept(false) {
    if (!data.empty()) {
        if (data.find("data: [DONE]") == std::string::npos) { // <-- the marker is searched for in the raw string
        }
        else {
            // the response is complete, erase the "pending" flag
            return true; // last message received
        }
    }
    return false; // data is empty
}

OS

macOS

Library version

liboai v4.0.1

fareesh commented 3 weeks ago

I managed to get this working with a few tweaks at the application layer. C++ is not my primary language so there may be better ways to achieve this. I suspect it may be best to add this to the library itself. I did not see anything in the documentation that addresses this use-case.

From what I could gather when using 3.5-turbo, the streaming chunks returned by the ChatStreamCallback in the ChatCompletion->create function consist of one or more lines of the form:

data: <partial or complete json>

OR

data: [DONE]

If the JSON is truncated, the remainder is sent in the following chunk.

Given the above, my methodology was to treat the chunk as a stringstream and split it via getline.

For each line, I first strip the data: prefix and append the remainder to a partial buffer, then attempt to parse the buffer as JSON. The buffer is intended to reconstruct the full JSON from consecutive chunks in cases where an object gets truncated.

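Here there are two cases: either the buffered text parses as complete JSON, in which case it is processed and the buffer is cleared, or it does not yet parse, in which case the buffer is kept and the next chunk is appended to it.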

There is also a check for whether the line, after the prefix is stripped, is equal to [DONE] (so that the application knows when the response is complete).
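The approach described above could be sketched roughly as follows. This is not code from liboai or from my application: StreamAssembler, Feed, and LooksLikeCompleteObject are hypothetical names, and the brace-balance check is only a stand-in for a real JSON parse attempt so that the example has no dependencies.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for "try to parse as JSON": true when the text starts with '{'
// and the braces outside of string literals balance out.
bool LooksLikeCompleteObject(const std::string& text) {
    if (text.empty() || text.front() != '{') return false;
    int depth = 0;
    bool in_string = false;
    bool escaped = false;
    for (char c : text) {
        if (in_string) {
            if (escaped) escaped = false;
            else if (c == '\\') escaped = true;
            else if (c == '"') in_string = false;
        } else if (c == '"') {
            in_string = true;
        } else if (c == '{') {
            ++depth;
        } else if (c == '}') {
            --depth;
        }
    }
    return depth == 0 && !in_string;
}

// Hypothetical accumulator: feed it each raw chunk from the stream callback.
struct StreamAssembler {
    std::string partial;  // unparsed text carried over between chunks
    bool done = false;    // set once "[DONE]" has been seen

    // Returns every JSON payload completed by this chunk.
    std::vector<std::string> Feed(const std::string& chunk) {
        static const std::string kPrefix = "data: ";
        std::vector<std::string> objects;
        std::istringstream lines(chunk);
        std::string line;
        while (std::getline(lines, line)) {
            if (!line.empty() && line.back() == '\r') line.pop_back();
            if (line.empty()) continue;  // blank separator between events
            partial += line;             // rejoin text split across chunks
            const bool has_prefix =
                partial.compare(0, kPrefix.size(), kPrefix) == 0;
            const std::string payload =
                has_prefix ? partial.substr(kPrefix.size()) : partial;
            if (payload == "[DONE]") {
                done = true;
                partial.clear();
            } else if (LooksLikeCompleteObject(payload)) {
                objects.push_back(payload);
                partial.clear();
            }  // otherwise keep buffering until the rest arrives
        }
        return objects;
    }
};
```

In this sketch the prefix is stripped after buffering rather than before, so a chunk that is cut in the middle of the data: prefix is still reassembled. Each string returned by Feed would then be handed to a real JSON parser.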

I suspect there are other language SDKs / libraries that have solved this already and so it may be more prudent to just adopt their approach, but I didn't get a chance to check them out since I was more concerned with completing the work required for my use-case.

seanchann commented 4 days ago

Can anyone fix this?