alexrudall / ruby-openai

OpenAI API + Ruby! 🤖❤️
https://insertrobot.com

Issue with streaming with Gemini #547

Open eltoob opened 6 days ago

eltoob commented 6 days ago

Describe the bug
Gemini just announced support for the OpenAI library. See here: https://ai.google.dev/gemini-api/docs/openai. For some reason, the Ruby library doesn't stream (or, to be more precise, it delivers the whole response at once). I tried the exact same request with the Python library and it streams properly.

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://ai.google.dev/
  2. Generate a key
  3. Run the code below
  4. There is no streaming
require 'openai'

client = OpenAI::Client.new(
  access_token: "API_KEY",
  uri_base: "https://generativelanguage.googleapis.com/v1beta/openai/"
)
start_time = Time.now
puts start_time
response = client.chat(
  parameters: {
    model: "gemini-1.5-flash",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Hello! write a poem about the moon make it 2000 words" }
    ],
    stream: proc do |chunk|
      current_time = Time.now
      elapsed = current_time - start_time
      puts "#{current_time}: chunk (#{elapsed.round(2)}s elapsed)"
    end
  }
)

You can execute the same code with Python and you will see that the stream works properly:

from openai import OpenAI
import time

start_time = time.time()

client = OpenAI(
    api_key="API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
  model="gemini-1.5-flash",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello! write a poem about the moon"}
  ],
  stream=True
)

for chunk in response:
    current_time = time.time()
    elapsed = current_time - start_time
    print(f"{elapsed:.2f}s elapsed:"

Expected behavior
The stream proc should be called incrementally as chunks arrive, just as the Python client streams the response.

Screenshots
Here I logged the time, and you can see that with Ruby it returns all the chunks at once:

image

Now with Python it actually streams:

image
eltoob commented 1 day ago

OK, quick update: I tried to replicate the exact same headers as the Python library. When I pass "Accept-Encoding" => "gzip, deflate" as a header, it kind of works (i.e. I do see the proc being called), but there are issues with the event stream parser.
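
For reference, passing that header through the library presumably looks like this (just a sketch, using the same extra_headers configuration hook as in the fix below; the header value is the one observed from the Python client):

OpenAI.configure do |config|
  # Mirror the Accept-Encoding header the Python client sends by default;
  # extra_headers is merged into every request the client makes.
  config.extra_headers = {
    "Accept-Encoding" => "gzip, deflate"
  }
end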

eltoob commented 22 hours ago

Ok I finally fixed the issue.

OpenAI.configure do |config|
  config.extra_headers = {
    "Accept-Encoding" => ""
  }
end

Not sure why.
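
For anyone else hitting this, the workaround applied to the original repro looks roughly like this (a sketch only: API_KEY is a placeholder, and printing the delta content via chunk.dig is an addition beyond the timing-only proc above):

require 'openai'

# Workaround from this thread: send an empty Accept-Encoding header
# (which tells the server not to compress the response).
OpenAI.configure do |config|
  config.extra_headers = {
    "Accept-Encoding" => ""
  }
end

client = OpenAI::Client.new(
  access_token: "API_KEY",
  uri_base: "https://generativelanguage.googleapis.com/v1beta/openai/"
)

start_time = Time.now
client.chat(
  parameters: {
    model: "gemini-1.5-flash",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Hello! write a poem about the moon" }
    ],
    # Called once per streamed chunk as it arrives.
    stream: proc do |chunk|
      elapsed = (Time.now - start_time).round(2)
      content = chunk.dig("choices", 0, "delta", "content")
      puts "#{elapsed}s: #{content}"
    end
  }
)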