Open thegrandpoobah opened 1 month ago
For reference, here is a Reddit post from about a month ago about the same problem: https://www.reddit.com/r/Bard/comments/1cg1nci/error_500_api/
I just had the same issue. Very annoying and limits the use cases for me. This is what I found in the docs:
Is there any kind of timetable to enable function calling with multimodal prompts ?
As a workaround I found the json mode functionality (only for 1.5 pro though):
It did work in my short tests with 1.5 pro and 1.5 flash, but I dont know if this will be reliable.
Issues with JSON Mode: According to the docs its only supported by 1.5 pro and the response_mime_type parameter does give me a ts error but I got consistent JSON output so far (about 20 API Calls)
When configuring Gemini/Vertex AI with a function call and a prompt that includes an image, VertexAI/Gemini throws a 500 error with no description of what the issue actually is.
Environment details
Steps to reproduce
In the documentation, all the examples for doing function calls are with text prompts, so I may just be doing something that is not supported, but I also couldn't find anything in the docs that said images are NOT supported as part of prompts for function calls.
Additionally, I have tested my prompt without the image as part of the context, and it does the function call as you would expect.
Please NOTE: I can reproduce this with both the vertex ai library and a straight up curl call as well, so this is most likely a Gemini issue rather than a library issue, but since I don't have a google support contract, I can't really open a support ticket, so for the sake of trying to bring visibility to this issue, I'm filing it here. Apologies if there is a better venue available.
POST API endpoint:
What is being passed to Vertex AI:
response:
a working JSON with only a text prompt: