Closed · ishaan-jaff closed this issue 7 months ago
@ishaan-jaff @krrishdholakia you all haven't seen any early info being released on the Gemini API, have you? I presume the web API is going to be the current Vertex API, but the LLM level will be new. Is that what you all are expecting?
I've not played with Vertex AI at all. Just looking to get an endpoint deployed in our account now to get more familiar.
Hopefully, pricing isn't going to be extortionate :)
gemini-pro api dropped 15 minutes ago. It didn't immediately work [with litellm] on my first try. Having a look, will add details as I go.
looks like it uses a new module in vertex ai.
Working on adding support for this now
does litellm really rely on a call to https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
to get the list of models?
yes - that lets us decouple adding new models from version bumps
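To make the lookup concrete, here's a minimal sketch of reading an entry out of that hosted JSON. The field names (`max_tokens`, `input_cost_per_token`, etc.) are assumed from the published file, and the values below are illustrative, not authoritative pricing:

```python
import json

# A minimal inline snippet shaped like litellm's hosted
# model_prices_and_context_window.json (field names assumed from the
# published file; the numbers here are illustrative only).
MODEL_DATA = json.loads("""
{
    "gemini-pro": {
        "max_tokens": 8192,
        "input_cost_per_token": 0.00000025,
        "output_cost_per_token": 0.0000005,
        "litellm_provider": "vertex_ai-language-models",
        "mode": "chat"
    }
}
""")

def lookup_max_tokens(model: str) -> int:
    """Return the context window recorded for a model (KeyError if unknown)."""
    return MODEL_DATA[model]["max_tokens"]

print(lookup_max_tokens("gemini-pro"))  # → 8192
```

In the real library the dict would be fetched from the raw.githubusercontent.com URL above instead of being inlined, which is what lets new models ship without a version bump.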
I see, but it means you can't easily test without deploying to master?
anyway, I'm having trouble finding the max_tokens for gemini-pro documented anywhere, may have to resort to empirically using tiktoken and making it barf w/ large payloads...
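The "make it barf with large payloads" idea is essentially a binary search for the largest accepted payload. A sketch, with the actual API call stubbed out by a predicate (a real probe would send an n-token prompt and catch the context-length error):

```python
def probe_context_limit(accepts, lo=1, hi=1_000_000):
    """Binary-search the largest token count `accepts` still allows.

    `accepts(n)` should send an n-token payload and return True if the
    API accepted it. Here it is a stub; a real probe would call the
    model and treat a context-length error as False.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # bias up so we converge on the last True
        if accepts(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# Stub standing in for a real API call; pretend the limit is 8192.
fake_api = lambda n: n <= 8192
print(probe_context_limit(fake_api))  # → 8192
```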
Would you do a placeholder for now?
Working on adding support for this now
https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/generative_ai/chat.py
vs
right?
so in this case we will need a new release / version bump
yea @phact
I can do a quick test if you have a branch ( though looks like you guys mostly just push on main 😀 )
Adding a list of things we need to make sure support is added for:
@janaka expect something out by 12pm PST
Here we go max tokens is 8192
Here's what i'm looking at @phact
https://ai.google.dev/models/gemini
I'll also run the vertex ai model against our context window limit testing to see what actually works
Looks like it doesn't natively support streaming. Will have to fake it like sagemaker/ai21/etc. <img width="1019" alt="Screenshot 2023-12-13 at 10 02 29 AM" src="https://github.com/BerriAI/litellm/assets/17561003/5b5ae963-c536-4535-8872-8c2fc2aef54f">
Odd, since I can see it's a private var
Update on streaming, looks like they support chat.send_message(.., stream=True)
for gemini cc: @ishaan-jaff
@janaka @phact initial dev release is now out - https://pypi.org/project/litellm/1.14.0.dev1/
Here's the relevant commit - https://github.com/BerriAI/litellm/commit/ef7a6e3ae1c6c41cf406fbb38eb483790fd196d1
Dev release is unstable. Full release will be out once commit has passed our ci/cd testing.
Please let me know if you see any bugs. I've also added gemini-pro testing to our ci/cd.
Code for testing:
import litellm
from litellm import completion  # completion is used below, so import it

litellm.vertex_project = "hardy-device-38811" # Your Project ID
litellm.vertex_location = "us-central1" # proj location

response = completion(model="gemini-pro", messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}])
raised APIError: VertexAIException - ChatSession.send_message() got an unexpected keyword argument 'temperature'.
getting this. running against source code. fairly sure it's referencing the latest commit.
noted, missed this in testing. Able to repro. thanks @janaka
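The error above is the classic shape of passing an OpenAI-style optional param to a backend whose method signature doesn't take it. A generic hedged sketch of the usual fix, filtering kwargs against a known-supported set before the provider call (parameter names here are illustrative, not Vertex AI's actual signature):

```python
def filter_kwargs(supported: set, **kwargs):
    """Split kwargs into those a backend accepts and those it doesn't,
    so optional OpenAI-style params (e.g. temperature) don't crash a
    provider method that lacks them in its signature."""
    kept = {k: v for k, v in kwargs.items() if k in supported}
    dropped = {k: v for k, v in kwargs.items() if k not in supported}
    return kept, dropped

kept, dropped = filter_kwargs(
    {"max_output_tokens"}, temperature=0.7, max_output_tokens=256
)
# kept → {"max_output_tokens": 256}; dropped → {"temperature": 0.7}
```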
Nice. FYI - I'm using it via LlamaIndex, so there might be other things in play. Not quite sure how to set location. I don't think additional_kwargs in the llama-index LiteLLM() constructor gets passed through for that?
I have to get shut eye now. Will try again in the morning :)
it works for me!
Interestingly, my first try gave this, which seems to be a content moderation error:
vertexai.generative_models._generative_models.ResponseBlockedError: The response was blocked.
I asked it to draw an ascii art kitten eating ice cream (no idea how that is unsafe). But a simpler prompt works great!
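One way an app can cope with those safety blocks is a retry-with-simpler-prompt wrapper. A self-contained sketch, with the Vertex call and its ResponseBlockedError replaced by local stubs so it runs anywhere (the real exception lives in vertexai.generative_models):

```python
class ResponseBlockedError(Exception):
    """Local stand-in for vertexai's ResponseBlockedError (stub)."""

def send(prompt: str) -> str:
    # Stubbed model call: pretend the safety filter blocks one prompt,
    # mirroring the ascii-art-kitten failure described above.
    if "ascii art kitten" in prompt:
        raise ResponseBlockedError("The response was blocked.")
    return f"ok: {prompt}"

def complete_with_fallback(prompt: str, fallback: str = "Please answer plainly.") -> str:
    """Retry once with a simpler prompt when the safety filter fires."""
    try:
        return send(prompt)
    except ResponseBlockedError:
        return send(fallback)

print(complete_with_fallback("draw an ascii art kitten eating ice cream"))
# → ok: Please answer plainly.
```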
@janaka if you have gcp credentials setup in your environment, i believe that should work - let me know if you hit any issues.
@phact lol
This is now out in v1.14.0
- pip install litellm==1.14.0 https://pypi.org/project/litellm/
"draw me a picture of a house" or "what's the sun?"
Error: VertexAIException - The response was blocked.
"what's a parrot?" works
Looks like the Responsible AI and Safety component is working overtime for Gemini
haha
I am going to test this out.