alexrudall / ruby-openai


Streamed Responses Missing First Token in Majority of Requests #364

Closed. Michael3434 closed this issue 7 months ago.

Michael3434 commented 7 months ago

I've been encountering an issue in my application (www.zipchat.ai) for the past 3-4 days: approximately 90% of the streamed responses from the OpenAI API are missing the first token. This happens when using the gem's streaming feature.

Environment:

Gem: ruby-openai
Ruby version: 3.1.2

Steps:

1. Send a request to the OpenAI API using the streaming feature.
2. Observe the streamed chunks of the response.

In the example below, notice that the first chunk containing content begins with a comma (","), which is not the expected start of the sentence.

Expected Behavior: The streamed response should include the entire content from the beginning of the response.

Actual Behavior: The first token of the response is missing from the streamed chunks. In the example below, the first chunk with content starts with a comma (","), indicating that the initial part of the response is missing.

Code Snippet:


def call_openai(&block)
  openai_client.chat(
    parameters: {
      model: "gpt-4-0613",
      messages: messages,
      temperature: 0,
      max_tokens: 300,
      stream: stream_proc(&block), # hand each parsed chunk to the proc below
    },
  )
end

def stream_proc(&block)
  proc do |chunk, _bytesize|
    p chunk # print every chunk for debugging
    content_match = chunk.dig("choices", 0, "delta", "content")
    # block_given? is unreliable inside a proc; check the captured block instead
    block.call(content_match) if content_match && block
  end
end

I then printed each chunk from the reply to see what was coming through:

First chunk received: {"id"=>"chatcmpl-8JI8bZmSFwrFU4HDNV3qv7bvkopwK", "object"=>"chat.completion.chunk", "created"=>1699608645, "model"=>"gpt-4-0613", "choices"=>[{"index"=>0, "delta"=>{"role"=>"assistant", "content"=>""}, "finish_reason"=>nil}]}
Second chunk received: {"id"=>"chatcmpl-8JI8bZmSFwrFU4HDNV3qv7bvkopwK", "object"=>"chat.completion.chunk", "created"=>1699608645, "model"=>"gpt-4-0613", "choices"=>[{"index"=>0, "delta"=>{"content"=>","}, "finish_reason"=>nil}]}
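
For reference, streamed deltas like these are reassembled by concatenating each "content" value. A minimal sketch of that pattern (illustrative only; the names here are mine, not the gem's):

full_message = +"" # mutable accumulator for the streamed reply
stream_handler = proc do |chunk, _bytesize|
  delta = chunk.dig("choices", 0, "delta", "content")
  full_message << delta if delta
end
# After the stream finishes, full_message should hold the complete reply,
# unless, as reported here, the first delta never arrives.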

Additional Observations:

This issue does not occur when running the chat without the stream option.

Adjusting the "n" parameter to 2 results in receiving the first token correctly, but the rest of the response is duplicated, which is not a viable solution.

I also tried other models (GPT-4 Turbo, GPT-3.5 Turbo); it happens on all of them.

I also tried replacing my prompt with a single, very basic line. Same issue.

Any insights or suggestions on how to resolve this issue would be greatly appreciated.

simaofreitas commented 7 months ago

Was coming here to report exactly the same :D

I found a couple of things.

In http.rb I removed the to_json wrapping in the json_post method call. This allowed me to see the error happening.

It seems the first part of the response arrives split across HTTP chunks and is not valid JSON on its own. Events are separated by a double newline ("\n\n"), but the gsub can't parse it correctly since there is no complete JSON object to work with.

Here's an example:

First chunk:

"data: {\"id\":\"chatcmpl-8JIZf2lNv8PXWPS01qwn4d19nVmkn\",\"object\":\"chat.completion.chunk\",\"created\":1699610323,\"model\":\"gpt-4-1106-preview\",\"system_fingerprint\":\"fp_a24b4d720c\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"\"},\"finish_reason\":null}]}\n\ndata: {\"id\":\"chatcmpl-8JIZf2lNv8PXWPS01qwn4d1"

Second chunk:

"9nVmkn\",\"object\":\"chat.completion.chunk\",\"created\":1699610323,\"model\":\"gpt-4-1106-preview\",\"system_fingerprint\":\"fp_a24b4d720c\",\"choices\":[{\"index\":0,\"delta\":{\"content\":\"F\"},\"finish_reason\":null}]}\n\n"

Meaning there are two "data" fields in a single text object that is returned across two chunks. The second chunk contains the "correct" first token I'm interested in, but it never gets parsed: the regex the gem uses (chunk.scan(/(?:data|error): (\{.*\})/i)) requires a complete JSON object within one chunk, so the truncated "data" portion in the first chunk is discarded, leading to the missing first token.
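
One common approach for this class of bug is to buffer incoming chunks and only parse events once a complete "\n\n"-terminated block has arrived. A minimal sketch of that idea (illustrative only; this is not the gem's actual code, and the class name is made up):

require "json"

class SSEBuffer
  def initialize(&on_event)
    @buffer = +""      # accumulates partial data across chunks
    @on_event = on_event
  end

  # Feed each raw HTTP chunk in as it arrives.
  def feed(chunk)
    @buffer << chunk
    # Split on the event delimiter; the last piece may be a partial
    # event, so keep it buffered for the next chunk.
    *complete, remainder = @buffer.split("\n\n", -1)
    @buffer = remainder
    complete.each { |event| handle(event) }
  end

  private

  def handle(event)
    # Pull the JSON payload out of a "data: {...}" line, skipping the
    # terminal "data: [DONE]" sentinel.
    payload = event[/\Adata: (?!\[DONE\])(.+)\z/m, 1]
    @on_event.call(JSON.parse(payload)) if payload
  end
end

With this, the two chunks above would yield both events intact, including the first token.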

Hope this helps with the fixing.

There are more examples of people mentioning this in the official OpenAI community forum.

Guessing this is a bug on OpenAI's side...

Michael3434 commented 7 months ago

Thanks a lot @simaofreitas, trying now!

simaofreitas commented 7 months ago

@Michael3434 this won't be a "fix" by itself; it was context for fixing the underlying issue in the gem. Not sure there is a super reliable way to fix this tbh, but maybe @alexrudall has ideas.

alexrudall commented 7 months ago

Hey - sorry to hear about this. What version of ruby-openai are you both running on? @Michael3434 @simaofreitas

Michael3434 commented 7 months ago

Hey! Actually it was ruby-openai (4.3.1), but I just updated to ruby-openai (6.0.1) and it works now... Very sorry about this.

Thanks again @alexrudall for this gem, it's amazing :) and thanks for your fast reply!

alexrudall commented 7 months ago

No worries at all, I love when this is the case ;) - for future readers, version 5.2 (https://github.com/alexrudall/ruby-openai/releases/tag/v5.2.0) fixed this issue.
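
For anyone landing here later, pinning the gem at or above the fixed release in your Gemfile would look like this (a suggested constraint, not something from the thread itself):

gem "ruby-openai", ">= 5.2"

Then run bundle update ruby-openai to pick up the fix.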

simaofreitas commented 7 months ago

Oh! Sorry about that. Thanks a lot, will upgrade and try it out.
