Closed · ishaan-jaff closed this issue 7 months ago
@ishaan-jaff @krrishdholakia you all haven't seen any early info being released on the Gemini API, have you? I presume the web API is going to be the current Vertex API, but the LLM level will be new. Is that what you all are expecting?
I've not played with Vertex AI at all. Just looking to get an endpoint deployed in our account now to get more familiar.
Hopefully, pricing isn't going to be extortionate :)
gemini-pro api dropped 15 minutes ago. It didn't immediately work [with litellm] on my first try. Having a look, will add details as I go.
looks like it uses a new module in vertex ai.
Working on adding support for this now
does litellm really rely on a call to https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
to get the list of models?
yes - that lets us decouple adding new models from version bumps
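To make the lookup concrete, here's a minimal sketch of reading an entry out of that hosted JSON. The field names (`max_tokens`, `input_cost_per_token`, etc.) are assumed from the published file, and the values below are illustrative, not authoritative pricing:

```python
import json

# A minimal inline snippet shaped like litellm's hosted
# model_prices_and_context_window.json (field names assumed from the
# published file; the numbers here are illustrative only).
MODEL_DATA = json.loads("""
{
    "gemini-pro": {
        "max_tokens": 8192,
        "input_cost_per_token": 0.00000025,
        "output_cost_per_token": 0.0000005,
        "litellm_provider": "vertex_ai-language-models",
        "mode": "chat"
    }
}
""")

def lookup_max_tokens(model: str) -> int:
    """Return the context window recorded for a model (KeyError if unknown)."""
    return MODEL_DATA[model]["max_tokens"]

print(lookup_max_tokens("gemini-pro"))  # → 8192
```

In the real library the dict would be fetched from the raw.githubusercontent.com URL above instead of being inlined, which is what lets new models ship without a version bump.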
I see, but it means you can't easily test without deploying to master?
anyway, I'm having trouble finding the max_tokens for gemini-pro documented anywhere, may have to resort to empirically using tiktoken and making it barf w/ large payloads...
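The "make it barf with large payloads" idea is essentially a binary search for the largest accepted payload. A sketch, with the actual API call stubbed out by a predicate (a real probe would send an n-token prompt and catch the context-length error):

```python
def probe_context_limit(accepts, lo=1, hi=1_000_000):
    """Binary-search the largest token count `accepts` still allows.

    `accepts(n)` should send an n-token payload and return True if the
    API accepted it. Here it is a stub; a real probe would call the
    model and treat a context-length error as False.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # bias up so we converge on the last True
        if accepts(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# Stub standing in for a real API call; pretend the limit is 8192.
fake_api = lambda n: n <= 8192
print(probe_context_limit(fake_api))  # → 8192
```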
Would you do a placeholder for now?
Working on adding support for this now
https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/generative_ai/chat.py
vs
right?
so in this case we will need a new release / version bump
yea @phact
I can do a quick test if you have a branch ( though looks like you guys mostly just push on main 😀 )
Adding a list of things we need to make sure support is added for:
@janaka expect something out by 12pm PST
Here we go max tokens is 8192
Here's what i'm looking at @phact
https://ai.google.dev/models/gemini
I'll also run the vertex ai model against our context window limit testing to see what actually works
Looks like it doesn't natively support streaming. Will have to fake it like sagemaker/ai21/etc. <img width="1019" alt="Screenshot 2023-12-13 at 10 02 29 AM" src="https://github.com/BerriAI/litellm/assets/17561003/5b5ae963-c536-4535-8872-8c2fc2aef54f">
Odd, since I can see it's a private var
Update on streaming, looks like they support chat.send_message(.., stream=True)
for gemini cc: @ishaan-jaff
@janaka @phact initial dev release is now out - https://pypi.org/project/litellm/1.14.0.dev1/
Here's the relevant commit - https://github.com/BerriAI/litellm/commit/ef7a6e3ae1c6c41cf406fbb38eb483790fd196d1
Dev release is unstable. Full release will be out once commit has passed our ci/cd testing.
Please let me know if you see any bugs. I've also added gemini-pro testing to our ci/cd.
Code for testing:
import litellm
from litellm import completion  # completion is used below, so import it

litellm.vertex_project = "hardy-device-38811" # Your Project ID
litellm.vertex_location = "us-central1" # proj location

response = completion(model="gemini-pro", messages=[{"role": "user", "content": "write code for saying hi from LiteLLM"}])
raised APIError: VertexAIException - ChatSession.send_message() got an unexpected keyword argument 'temperature'.
getting this. running against source code. fairly sure it's referencing the latest commit.
noted, missed this in testing. Able to repro. thanks @janaka
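The error above is the classic shape of passing an OpenAI-style optional param to a backend whose method signature doesn't take it. A generic hedged sketch of the usual fix, filtering kwargs against a known-supported set before the provider call (parameter names here are illustrative, not Vertex AI's actual signature):

```python
def filter_kwargs(supported: set, **kwargs):
    """Split kwargs into those a backend accepts and those it doesn't,
    so optional OpenAI-style params (e.g. temperature) don't crash a
    provider method that lacks them in its signature."""
    kept = {k: v for k, v in kwargs.items() if k in supported}
    dropped = {k: v for k, v in kwargs.items() if k not in supported}
    return kept, dropped

kept, dropped = filter_kwargs(
    {"max_output_tokens"}, temperature=0.7, max_output_tokens=256
)
# kept → {"max_output_tokens": 256}; dropped → {"temperature": 0.7}
```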
Nice. FYI - I'm using it via LlamaIndex, so there might be other things in play. Not quite sure how to set location. I don't think additional_kwargs in the llama-index LiteLLM() constructor gets passed through for that?
I have to get shut eye now. Will try again in the morning :)
it works for me!
Interestingly, my first try gave this, which seems to be a content moderation error:
vertexai.generative_models._generative_models.ResponseBlockedError: The response was blocked.
I asked it to draw an ascii art kitten eating ice cream (no idea how that is unsafe). But a simpler prompt works great!
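One way an app can cope with those safety blocks is a retry-with-simpler-prompt wrapper. A self-contained sketch, with the Vertex call and its ResponseBlockedError replaced by local stubs so it runs anywhere (the real exception lives in vertexai.generative_models):

```python
class ResponseBlockedError(Exception):
    """Local stand-in for vertexai's ResponseBlockedError (stub)."""

def send(prompt: str) -> str:
    # Stubbed model call: pretend the safety filter blocks one prompt,
    # mirroring the ascii-art-kitten failure described above.
    if "ascii art kitten" in prompt:
        raise ResponseBlockedError("The response was blocked.")
    return f"ok: {prompt}"

def complete_with_fallback(prompt: str, fallback: str = "Please answer plainly.") -> str:
    """Retry once with a simpler prompt when the safety filter fires."""
    try:
        return send(prompt)
    except ResponseBlockedError:
        return send(fallback)

print(complete_with_fallback("draw an ascii art kitten eating ice cream"))
# → ok: Please answer plainly.
```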
@janaka if you have gcp credentials setup in your environment, i believe that should work - let me know if you hit any issues.
@phact lol
This is now out in v1.14.0
- pip install litellm==1.14.0 https://pypi.org/project/litellm/
"draw me a picture of a house" or "what's the sun?"
Error: VertexAIException - The response was blocked.
"what's a parrot?" works
Looks like the Responsible AI and Safety component is working overtime for Gemini
haha
I am going to test this out.