Function calling with a response mime type: 'application/json' is unsupported, gemini-1.5-pro-latest

sahill3 commented 3 months ago

Description of the bug:

I’m encountering an issue with the gemini-1.5-flash model from the google.generativeai package when trying to generate content with a response MIME type set to application/json. According to the documentation, this MIME type should be supported, but I’m receiving the following error:

google.api_core.exceptions.InvalidArgument: 400 Function calling with a response mime type: 'application/json' is unsupported

Note: I am supplying the schema as text in the prompt because Flash does not support a structured schema supplied through the model configuration.

Code Example

from google.generativeai import GenerativeModel

def get_products():
    return {
        "product_name": "laptop",
        "price": 20000,
        "permalink": "#link",
        "image_link": "#image"
    }

model = GenerativeModel('gemini-1.5-flash-latest',
                        tools=[get_products],
                        generation_config={"response_mime_type": "application/json"})

system_instruction = """
You are a helpful e-commerce assistant. When responding, use the following formats:
1. Normal Response: {"type": "message", "data": str}
2. Product Response: {"type": "products", "data": list[{"product_name": str, "price": int, "permalink": str, "image_link": str}]}
"""

model.generate_content(system_instruction)

Use Case:

I am working on an e-commerce assistant that needs to generate responses in specific formats. Here’s a brief overview of my use case:

System Instruction:

I want to set up the assistant with a system instruction that defines how it should format responses. The assistant should handle two types of responses:

Normal Responses: Simple text responses with a message type.

Product Responses: Structured JSON responses with a list of products.

Normal Response:

{
  "type": "message",
  "data": "How can I help you?"
}

Products Response:

{
  "type": "products",
  "data": [
    {
      "product_name": "samsung",
      "price": 20000,
      "permalink": "#link1",
      "image_link": "#image1"
    },
    {
      "product_name": "iphone",
      "price": 100000,
      "permalink": "#link2",
      "image_link": "#image2"
    }
  ]
}

Questions

Is there an alternative way to format responses in JSON using the gemini-1.5-flash/pro model?
Are there any updates or workarounds for handling structured JSON responses with this model?

Thank you for your assistance!

gmKeshari commented 3 months ago

Hi @sahill3

The gemini-1.5-flash model you're using doesn't currently support returning JSON responses when calling functions. You can refer to this doc.

There are two ways to supply a schema to the model:

As text in the prompt. This approach works with both Gemini 1.5 Flash and Gemini 1.5 Pro.
As a structured schema supplied through model configuration. This approach works with Gemini 1.5 Pro but not Gemini 1.5 Flash.

So either use the first approach or switch to a different model (for example, gemini-1.5-pro).
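
A schema supplied only as prompt text is an instruction to the model, not something the API enforces, so it may help to check the reply on the client side. Below is a minimal sketch of such a check for the two formats described in the issue. The names `PRODUCT_FIELDS` and `parse_reply` are hypothetical and not part of google.generativeai; `raw` is assumed to be the text of the model's reply.

```python
import json

# Expected field types for one product, taken from the formats in the issue.
PRODUCT_FIELDS = {
    "product_name": str,
    "price": int,
    "permalink": str,
    "image_link": str,
}


def parse_reply(raw: str) -> dict:
    """Parse a reply and check it against the 'message' / 'products' formats.

    Hypothetical helper: raises ValueError if the reply does not match.
    """
    reply = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(reply, dict):
        raise ValueError("reply must be a JSON object")

    kind, data = reply.get("type"), reply.get("data")
    if kind == "message":
        if not isinstance(data, str):
            raise ValueError("'message' reply needs a string in 'data'")
    elif kind == "products":
        if not isinstance(data, list):
            raise ValueError("'products' reply needs a list in 'data'")
        for item in data:
            if not isinstance(item, dict):
                raise ValueError("each product must be a JSON object")
            for field, expected in PRODUCT_FIELDS.items():
                # Exact type check so that booleans are not accepted as prices.
                if type(item.get(field)) is not expected:
                    raise ValueError(f"product field {field!r} is missing or has the wrong type")
    else:
        raise ValueError(f"unknown reply type: {kind!r}")
    return reply
```

If the check fails, one option is to retry the request or fall back to a plain message; which is appropriate depends on the application.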

MarkDaoust commented 3 months ago

Interesting, I've never tried to combine these two features. It seems that you can't do them both at the same time.

So maybe you have three options:

1. If you have a clear sequence of states you need to go through you can force a function call (function_calling_mode=ANY), then use a different model object and ask for the next step with response_mime_type=application/json.
2. You could define all three responses as functions and instead of parsing json, parse the function objects it returns.
3. You could define all three messages as JSON and reply with the product list when the model says {"get_products": {}}.
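
The last option could look roughly like the sketch below. It assumes the model runs in JSON mode without tools and has been told in the prompt to emit {"get_products": {}} when it wants the product list. `handle_reply` is a hypothetical helper, and `get_products` here returns a list (wrapping the single product from the original example) so that it fits the "products" format.

```python
import json


def get_products() -> list:
    # Placeholder catalog lookup, reusing the product from the original example.
    return [
        {
            "product_name": "laptop",
            "price": 20000,
            "permalink": "#link",
            "image_link": "#image",
        }
    ]


def handle_reply(raw: str) -> dict:
    """Turn the model's JSON reply into the payload sent to the client.

    Hypothetical helper: if the model asked for products by replying with
    {"get_products": {}}, answer with the product list; otherwise pass the
    reply through unchanged.
    """
    reply = json.loads(raw)
    if isinstance(reply, dict) and "get_products" in reply:
        return {"type": "products", "data": get_products()}
    return reply
```

With this approach the application, not the SDK, decides when the function runs, so function calling does not need to be enabled on the model at all.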