Portkey-AI / gateway

A Blazing Fast AI Gateway with integrated Guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API.
https://portkey.ai/features/ai-gateway
MIT License
6.32k stars 451 forks

Google Gemini Vision issues with multi provider config #721

Open hongkongkiwi opened 3 weeks ago

hongkongkiwi commented 3 weeks ago

What Happened?

I'd like to send some images to Google Gemini for a vision call.

The problem is that Google Gemini expects files to be uploaded through their file storage API, or you must pass a custom gs:// URL. So basically it will not accept arbitrary http/https URLs. Other providers, such as OpenAI, do accept them.

With a single provider this is not a problem, because we can handle it explicitly. It becomes a problem when using one of Portkey's key features, the multi-provider config, since OpenAI expects an https URL and Gemini expects a gs:// URL.

What Should Have Happened?

I'm not sure of the best way to handle this. I'm developing a vision app and would really like to use the multi-provider config.

I am thinking the best way to handle this would be to add an additional key that is only used with Google Gemini. Gemini would use this key instead of url if it exists, and otherwise fall back to url. That way, a multi-provider config can pick the right key based on the provider.

It does become a problem, though, when using a config ID: if you don't know whether Google Gemini is in the list of providers, you have no way of knowing whether to pass this Gemini-specific key in the call.
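
As a rough illustration of the extra-key idea, here is a minimal Python sketch. Everything in it is an assumption for discussion: the key name `gcs_url`, the provider names in `GCS_PROVIDERS`, and the helper `pick_image_url` are hypothetical and not part of Portkey's or any provider's API.

```python
from typing import Dict

# Hypothetical set of provider names that would prefer a gs:// URI.
GCS_PROVIDERS = {"google", "vertex-ai"}


def pick_image_url(image_url: Dict[str, str], provider: str) -> str:
    """Choose which URL to forward for the given provider.

    `gcs_url` is a hypothetical, Gemini-only key; `url` is the usual field.
    """
    gcs_url = image_url.get("gcs_url")
    if provider in GCS_PROVIDERS and gcs_url:
        # Gemini-style providers use the dedicated key when it is present.
        return gcs_url
    # Everyone else (and Gemini without the extra key) falls back to `url`.
    return image_url["url"]


image = {
    "url": "https://example.com/cat.png",
    "gcs_url": "gs://example-bucket/cat.png",
}
print(pick_image_url(image, "google"))
print(pick_image_url(image, "openai"))
```

Because the caller can always send both keys, this would sidestep the config ID concern at the cost of uploading every image to both locations.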

Relevant Code Snippet

No response

Your Twitter/LinkedIn

https://www.linkedin.com/in/andysavage

hongkongkiwi commented 3 weeks ago

Actually, thinking about this some more, you could have a special format for the image_url field. Something like this:

gs://.......|https://......

If the model is not from Google (or vertex-ai), it would simply ignore the gs:// URL and move on to the URL after the delimiter (e.g. |). This keeps backwards compatibility and does not require adding any new fields.

This would also work with provider fallback etc., since other vision models would just silently ignore the gs:// URL as long as an http/https URL is also provided.
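
The delimiter format could be resolved along these lines. This is only a sketch under stated assumptions: the function name `resolve_delimited_url`, the provider names in `GCS_PROVIDERS`, and the choice to raise `ValueError` when no usable URL is found are all hypothetical, not existing gateway behaviour.

```python
# Hypothetical set of provider names that can consume gs:// URIs.
GCS_PROVIDERS = {"google", "vertex-ai"}


def resolve_delimited_url(value: str, provider: str, delimiter: str = "|") -> str:
    """Pick one URL out of a 'gs://...|https://...' style value."""
    candidates = [part.strip() for part in value.split(delimiter) if part.strip()]
    gcs = [c for c in candidates if c.startswith("gs://")]
    web = [c for c in candidates if c.startswith(("http://", "https://"))]

    if provider in GCS_PROVIDERS and gcs:
        # Google-style providers take the gs:// entry when one is given.
        return gcs[0]
    if web:
        # Other providers skip gs:// entries; a plain single URL also lands here.
        return web[0]
    raise ValueError(f"no usable image URL for provider {provider!r}")


value = "gs://example-bucket/cat.png|https://example.com/cat.png"
print(resolve_delimited_url(value, "vertex-ai"))
print(resolve_delimited_url(value, "openai"))
```

A value with no delimiter passes through unchanged, which is what keeps existing calls working. One caveat with this approach is that a literal `|` can still show up in some real-world URLs, so a production version would probably need an escaping rule or a different separator.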

narengogi commented 1 week ago

This has been resolved in the Discord thread; attaching it for reference: https://discord.com/channels/1143393887742861333/1302311458465513624
