@the-wdr a workaround for this is to use the pass-through endpoint - https://docs.litellm.ai/docs/pass_through/vertex_ai
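A minimal sketch of that workaround against a locally running proxy (the route prefix, port, key, and exact resource path below are assumptions; check the linked docs for the precise format):

```python
import requests

# Hypothetical pass-through call: forward a native Vertex AI generateContent
# request for the fine-tuned endpoint through the LiteLLM proxy.
# PROXY_BASE, the bearer token, and the resource path are placeholders.
PROXY_BASE = "http://localhost:4000"

resp = requests.post(
    f"{PROXY_BASE}/vertex_ai/projects/<PROJECT_ID>/locations/<LOCATION>"
    "/endpoints/<ENDPOINT_ID>:generateContent",
    headers={"Authorization": "Bearer sk-1234"},
    json={"contents": [{"role": "user", "parts": [{"text": "hi"}]}]},
)
print(resp.json())
```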
Interesting:
```yaml
- model_name: finetuned-gemini
  litellm_params:
    model: vertex_ai/<ENDPOINT_ID>
    vertex_project: <PROJECT_ID>
    vertex_location: <LOCATION>
  model_info:
    base_model: vertex_ai/gemini-pro
```
I see the use of base_model. I believe we can use that to route the call correctly (I believe it's the generateContent endpoint for Gemini models).
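As a rough illustration of that idea (purely a sketch; the helper name and logic below are made up, not LiteLLM's actual routing code), the base_model hint is enough to decide between :generateContent and :predict:

```python
def pick_vertex_route(model: str, base_model: str = "") -> str:
    """Hypothetical helper: choose the Vertex endpoint suffix for a deployment.

    A fine-tuned Gemini deployment is addressed by a numeric endpoint ID, so
    the model string alone ("vertex_ai/<ENDPOINT_ID>") doesn't reveal that it
    is Gemini -- but a base_model hint like "vertex_ai/gemini-pro" does, and
    Gemini models must be called via :generateContent rather than :predict.
    """
    name = (base_model or model).lower()
    return ":generateContent" if "gemini" in name else ":predict"


# pick_vertex_route("vertex_ai/1234567890", "vertex_ai/gemini-pro")
# -> ":generateContent"
```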
We already support fine-tuned models on Vertex AI https://docs.litellm.ai/docs/providers/vertex#fine-tuned-models and send them to the correct endpoint.
@the-wdr what version of litellm are you on? Can you try the latest version of litellm?
Link to the relevant test: https://github.com/BerriAI/litellm/blob/cd8d7ca9156a5fc2510db1ef0d43956d3239eccf/litellm/tests/test_amazing_vertex_completion.py#L2230
@ishaan-jaff we support fine-tuned models, but I believe they're currently routed to the Vertex AI Model Garden predict endpoint.
Here's what I'm looking at in the code to confirm this - https://github.com/BerriAI/litellm/blob/cd8d7ca9156a5fc2510db1ef0d43956d3239eccf/litellm/main.py#L2126
How would it enter that branch @krrishdholakia for a fine-tuned model? A fine-tuned model has model=vertex_ai/<ENDPOINT_ID>; I don't see gemini in there.
Relevant PR adding vertex_ai fine-tuned support: https://github.com/BerriAI/litellm/pull/5371
I suspect the issue is that we expect Vertex fine-tuned models to use vertex_ai_beta, so the call does not get routed correctly. I'm able to repro locally when using vertex_ai instead of vertex_ai_beta. Working on a fix.
```
=========================== short test summary info ===========================
FAILED test_amazing_vertex_completion.py::test_completion_fine_tuned_model - litellm.exceptions.InternalServerError: litellm.InternalServerError: VertexAIException InternalServerError - 400 Gemini cannot be accessed through Vertex Predict/RawPredict API. Please follow https://cloud.google.com/vertex-ai/docs/generative-ai/start/quickstarts/quickstart-multimodal for Gemini usage.
```
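A rough sketch of that repro (endpoint, project, and location are placeholders, and the vertex_ai_beta form shown as the working path is an assumption based on the comments above, not a confirmed API):

```python
import litellm

messages = [{"role": "user", "content": "hi"}]

# Placeholder values -- substitute a real fine-tuned Gemini deployment.
kwargs = dict(
    messages=messages,
    vertex_project="<PROJECT_ID>",
    vertex_location="<LOCATION>",
)

# With the vertex_ai prefix the call is routed to Predict/RawPredict,
# which Gemini rejects (see the failing test output above).
try:
    litellm.completion(model="vertex_ai/<ENDPOINT_ID>", **kwargs)
except litellm.InternalServerError as err:
    print("vertex_ai routing failed:", err)

# Per the discussion above, the vertex_ai_beta prefix routes the same
# deployment through :generateContent instead.
response = litellm.completion(model="vertex_ai_beta/<ENDPOINT_ID>", **kwargs)
print(response.choices[0].message.content)
```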
@ishaan-jaff already on it
I think we can just use the base_model given in model_info.
> already on it

Sounds good, I'll let you fix it then.
Hi @the-wdr, curious, do you use LiteLLM today? If so, I'd love to hop on a call and learn how we can improve LiteLLM for you.
What happened?
Bug Report: Finetuned Gemini-Pro Models Do Not Work with Vertex AI Provider
Finetuned Gemini-Pro models are not functioning correctly with the vertex_ai provider in our setup. When attempting to make a prediction using a finetuned Gemini-Pro model, the error shown under Actual Behavior is returned.

Steps to Reproduce:
1. Use the following configuration in config.yaml (the same config quoted earlier in this thread):
2. Test a non-finetuned model: the Gemini-Pro model should work with the Vertex AI Predict or RawPredict API as expected, returning predictions without error.
3. Attempt to run predictions using the finetuned-gemini model via the LiteLLM completion API (a sketch of such a call follows below).
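For illustration, such a request against the proxy might look like this (base URL and key are placeholders):

```python
from openai import OpenAI

# Placeholder proxy address and key; "finetuned-gemini" matches the
# model_name from the config.yaml entry above.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="finetuned-gemini",
    messages=[{"role": "user", "content": "Hello from the fine-tuned model"}],
)
print(response.choices[0].message.content)
```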
Expected Behavior:
The finetuned Gemini-Pro model should work with the vertex_ai provider as expected, returning predictions without error.
Actual Behavior:
The API returns a 500 error with the following message:
Environment Details:
Additional Notes:
Requests appear to be routed to the Predict/RawPredict API, instead of the :generateContent endpoint.
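For context on that note, the two Vertex AI calls differ only in the method suffix on the deployed endpoint (the URL shapes below are an approximation of the public Vertex AI REST API, not taken from this issue):

```python
# Approximate Vertex AI REST URL shapes for a deployed endpoint.
BASE = (
    "https://<LOCATION>-aiplatform.googleapis.com/v1/"
    "projects/<PROJECT_ID>/locations/<LOCATION>/endpoints/<ENDPOINT_ID>"
)

PREDICT_URL = BASE + ":predict"                   # rejected by Gemini models
GENERATE_CONTENT_URL = BASE + ":generateContent"  # what Gemini requires
```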
References:

Please advise on how to resolve this issue, or clarify whether these models require a different approach for usage via the LiteLLM Proxy.
Relevant log output
No response