gbaptista / gemini-ai

A Ruby Gem for interacting with Gemini through Vertex AI, Generative Language API, or AI Studio, Google's generative AI services.
https://rubygems.org/gems/gemini-ai
MIT License

My api result is an array with the response broken up over multiple candidates #5

Closed joshdaloewen closed 11 months ago

joshdaloewen commented 11 months ago

I'm wondering if I'm missing a configuration. My response is broken up into numerous parts. I know I could combine them, but I'm wondering why this result is different than what I get when I run a curl command to hit the API.

I run this:

    client = Gemini.new(
      credentials: {
        service: "generative-language-api",
        api_key: ENV["GEMINI_API_KEY"],
      },
      options: { model: "gemini-pro", stream: false },
    )

    result = client.stream_generate_content({
      contents: { role: "user", parts: { text: "Write an essay on the history of Canada." } },
    })

and I get this response:

{
  "candidates"=>[
    {
      "content"=>{
        "parts"=>[
          {
            "text"=>"The history of Canada is a rich and complex tapestry of Indigenous civilizations, European exploration"
          }
        ],
        "role"=>"model"
      },
      "finishReason"=>"STOP",
      "index"=>0,
      "safetyRatings"=>[
        {
          "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HATE_SPEECH",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HARASSMENT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability"=>"NEGLIGIBLE"
        }
      ]
    }
  ],
  "promptFeedback"=>{
    "safetyRatings"=>[
      {
        "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "probability"=>"NEGLIGIBLE"
      },
      {
        "category"=>"HARM_CATEGORY_HATE_SPEECH",
        "probability"=>"NEGLIGIBLE"
      },
      {
        "category"=>"HARM_CATEGORY_HARASSMENT",
        "probability"=>"NEGLIGIBLE"
      },
      {
        "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
        "probability"=>"NEGLIGIBLE"
      }
    ]
  }
}{
  "candidates"=>[
    {
      "content"=>{
        "parts"=>[
          {
            "text"=>", colonization, confederation, and nation-building. Spanning thousands of years, it is a story of diverse peoples, cultures, and events that have shaped"
          }
        ],
        "role"=>"model"
      },
      "finishReason"=>"STOP",
      "index"=>0,
      "safetyRatings"=>[
        {
          "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HATE_SPEECH",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HARASSMENT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability"=>"NEGLIGIBLE"
        }
      ]
    }
  ]
} ... etc ...
{
  "candidates"=>[
    {
      "content"=>{
        "parts"=>[
          {
            "text"=>"th century, Canada played a major role in both World Wars, and its contributions helped to establish the country as a respected member of the international community. After the Second World War, Canada experienced a period of rapid economic growth and social change, leading to the Quiet Revolution in Quebec and the rise of a more diverse and multicultural society.\n\nToday, Canada is a modern, democratic, and prosperous country with a rich history and culture. It is a nation built on the principles of peace, order, and good government, and it continues to play an active role in global affairs, promoting peace, security, and human rights around the world."
          }
        ],
        "role"=>"model"
      },
      "finishReason"=>"STOP",
      "index"=>0,
      "safetyRatings"=>[
        {
          "category"=>"HARM_CATEGORY_SEXUALLY_EXPLICIT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HATE_SPEECH",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_HARASSMENT",
          "probability"=>"NEGLIGIBLE"
        },
        {
          "category"=>"HARM_CATEGORY_DANGEROUS_CONTENT",
          "probability"=>"NEGLIGIBLE"
        }
      ]
    }
  ]
}
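
For reference, combining the chunks shown above is straightforward. The following is a minimal sketch, assuming the result is an Array of Hashes with string keys shaped like the output above, and that each chunk carries a single candidate. `combine_chunks` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): joins the text parts of every
# chunk into one string, in the order the chunks were received.
def combine_chunks(chunks)
  chunks
    .flat_map { |chunk| chunk.fetch("candidates", []) }
    .flat_map { |candidate| candidate.dig("content", "parts") || [] }
    .map { |part| part["text"].to_s }
    .join
end

# Sample data shaped like the response above (safety ratings omitted).
chunks = [
  { "candidates" => [{ "content" => { "parts" => [{ "text" => "The history of Canada is " }], "role" => "model" } }] },
  { "candidates" => [{ "content" => { "parts" => [{ "text" => "a rich and complex tapestry." }], "role" => "model" } }] }
]

puts combine_chunks(chunks)
```

If a response ever carried more than one candidate per chunk, this sketch would interleave them; grouping by the candidate's "index" first would be needed in that case.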

gbaptista commented 11 months ago

Hey @joshdaloewen, thanks for opening an issue.

> result is different than what I get when I run a curl command to hit the API

Could you please share your cURL command?

By running the code you shared, here's the underlying cURL equivalent generated by the Gem:

curl --request POST \
  --url "https://generativelanguage.googleapis.com/v1/models/gemini-pro:streamGenerateContent?key=$GEMINI_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
  "contents": {
    "role": "user",
    "parts": {
      "text": "Write an essay on the history of Canada."
    }
  }
}'

joshdaloewen commented 11 months ago

I figured it out, but I'm not sure how you'd want to handle it.

The problem is that

def stream_generate_content(payload, stream: nil, &callback)
  request('streamGenerateContent', payload, stream:, &callback)
end

uses the endpoint "streamGenerateContent" regardless of whether stream is true or false. But that endpoint should be "generateContent" when not streaming.

In my mind, the refactor that would be the most intuitive for users would be to:

  1. remove stream from the user configuration altogether
  2. create stream_generate_content and generate_content methods, which hit request with the different endpoints and an internal stream parameter (see below)
  3. refactor your request method so that it requires the stream argument, and then it handles it as it does currently
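
A rough sketch of the three steps above, to make the shape of the proposal concrete. `SketchClient` and its stubbed `request` are hypothetical and only illustrate the idea; this is not the gem's actual code, and the real `request` method performs an HTTP call.

```ruby
# Hypothetical sketch of the proposed split; not the gem's implementation.
class SketchClient
  # Non-streaming call: always uses the generateContent endpoint.
  def generate_content(payload, &callback)
    request("generateContent", payload, stream: false, &callback)
  end

  # Streaming call: always uses the streamGenerateContent endpoint.
  def stream_generate_content(payload, &callback)
    request("streamGenerateContent", payload, stream: true, &callback)
  end

  private

  # Stand-in for the real HTTP request: stream is a required keyword here,
  # and the method just returns a description of what would be sent.
  def request(endpoint, payload, stream:, &callback)
    { endpoint: endpoint, stream: stream, payload: payload }
  end
end

client = SketchClient.new
payload = { contents: { role: "user", parts: { text: "hi!" } } }

p client.generate_content(payload)
p client.stream_generate_content(payload)
```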

That said, you could also keep letting the user pass stream, but in that case perhaps stream_generate_content could have a less confusing name?

gbaptista commented 11 months ago

Got it.

I suspect that generateContent will eventually be deprecated, since the newer API, Vertex, no longer includes this method. Regardless, you can use streamGenerateContent without streaming by simply not enabling server-sent events (?alt=sse). Curiously, you can also use server-sent events with generateContent, even though it is not designed for "streaming". Yeah, this is confusing.
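
To illustrate how the endpoint name and server-sent events are independent, here is a small sketch that builds the request URL for either combination. The base URL and the `alt=sse` query parameter come from the cURL command and the comment above; `endpoint_url` and its keyword names are hypothetical and not part of the gem.

```ruby
require "uri"

# Base URL taken from the cURL command earlier in this thread.
BASE_URL = "https://generativelanguage.googleapis.com/v1/models"

# Hypothetical helper: the API method name and the SSE switch are
# chosen independently of each other.
def endpoint_url(model:, api_method:, api_key:, server_sent_events: false)
  query = { key: api_key }
  query[:alt] = "sse" if server_sent_events
  "#{BASE_URL}/#{model}:#{api_method}?#{URI.encode_www_form(query)}"
end

# streamGenerateContent as a plain HTTP request (no SSE):
puts endpoint_url(model: "gemini-pro", api_method: "streamGenerateContent", api_key: "KEY")

# generateContent with server-sent events enabled:
puts endpoint_url(model: "gemini-pro", api_method: "generateContent", api_key: "KEY", server_sent_events: true)
```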

I would be inclined to distinguish between the concept of an HTTP request that creates a stream to receive server-sent events and the concept of streaming related to the expected behavior and output format of the endpoints.

Why? We have a lot of possible endpoints in the API:

All of them may support "streaming" (Server-Sent Events) or not.

I would prefer to keep the names of the methods faithful to the original names of the raw cURL API:

Allowing any of them to be accessed through a standard HTTP request or by enabling server-sent events.

Perhaps the refactoring needed to eliminate ambiguity and confusion for users would be renaming stream to server_sent_events. This change would clearly distinguish the concept of streaming from SSE, I believe:

client = Gemini.new(
  credentials: { ... },
  options: { model: 'gemini-pro', server_sent_events: true }
)

client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } },
  server_sent_events: true
) do |event, parsed, raw|
  puts event
end

result = client.stream_generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } },
  server_sent_events: false
)

result = client.generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } },
)

client.generate_content(
  { contents: { role: 'user', parts: { text: 'hi!' } } },
  server_sent_events: true
) do |event, parsed, raw|
  puts event
end

Does that make sense?

joshdaloewen commented 11 months ago

Yeah, I like that, and I also strongly agree that keeping the method names the same as the raw API URLs will be the most intuitive approach moving forward.

gbaptista commented 11 months ago

Done: