response_schema is not working when using typing.TypedDict

mwigh commented 2 months ago

Description of the bug:

When using response_schema in generate_content the response schema is not respected if the response_schema is set using a <class 'typing_extensions._TypedDictMeta'> object

Actual vs expected behavior:

I expect the response schema to be respected, since according to the documentation it should: https://ai.google.dev/gemini-api/docs/structured-output?lang=python#generate-json

The issue is probably that the fields are not set as "required". Can that be done somehow?

Any other information you'd like to share?

Code where I explicitly set the schema but also explicitly ask the model not to respect it (it still should, according to the documentation):

import typing_extensions as typing
import google.generativeai as genai
import os

class Recipe(typing.TypedDict):
    recipe_name: str
    ingredients: list[str]

genai.configure(api_key=os.environ["API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro-latest")
result = model.generate_content(
    "List one popular cookie recipe. Response should be a JSON string on format {receipt_name: str}. There should only be one key in the response.",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json", response_schema=Recipe
    ),
)
print(result.text)

Hamza-nabil commented 2 months ago

This issue seems related to #541

Gunand3043 commented 1 month ago

@MarkDaoust, there is an issue with passing a text prompt schema and a model configuration schema at the same time. Please find the gist here for reference. Thanks!

mwigh commented 1 month ago

@Gunand3043 The problem is probably that GenerationConfig does not set the fields as required.

Your "Example 2: (working) - Supply a schema through model configuration" does not work in all cases; that is how I found this issue.

kinto-b commented 1 month ago

Yep there's a reprex of the Example 2 type failing here: https://github.com/google-gemini/generative-ai-python/issues/541#issuecomment-2341821334

HNx1 commented 1 month ago

EDIT: Edited to reflect below conversation

Hi, just wanted to share an example: https://colab.research.google.com/drive/1ZdyrENeHWh30Ijws0o8l2Od5M4i56HjD?usp=sharing

In this case, I tested with both the python SDK and the REST API. The REST API call works all the time, and the python SDK does not work.

I tested with both gemini pro and flash and neither produced structured outputs in the sdk, but did with the rest api.

Extra test, excluding response_schema as suggested: this now works 100% of the time on both the SDK and the REST API.

mwigh commented 1 month ago

I only mean that it seems to be losing the required field when creating the response schema, and that code is also in this SDK.

See this commit that fixes it: https://github.com/mwigh/generative-ai-python/commit/f30c36d738709732598a9523683a076f7b923621

HNx1 commented 1 month ago

Ah I see. So if you supply all properties as 'required' to the REST API, it works.

Which suggests you are correct that this is a local issue to the SDK missing required flags.

I corrected the openapi schema builder function in my colab to now set required on all properties in the object, and now it works again for the REST API.
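For anyone who wants to try the same thing against the REST API, here is a minimal sketch of a schema builder that marks every property as required. It is not the function from the colab or the SDK's own builder; `build_schema` and `_PRIMITIVES` are hypothetical names, and it only handles primitives, `list[...]` and TypedDict classes.

```python
import typing

# Hypothetical mapping from Python types to OpenAPI-style type names.
_PRIMITIVES = {str: "STRING", int: "INTEGER", float: "NUMBER", bool: "BOOLEAN"}


def build_schema(tp) -> dict:
    """Build an OpenAPI-style schema dict for a primitive, list[...] or TypedDict."""
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    if typing.get_origin(tp) is list:
        (item_type,) = typing.get_args(tp)
        return {"type": "ARRAY", "items": build_schema(item_type)}
    if hasattr(tp, "__required_keys__"):  # TypedDict classes carry this attribute
        hints = typing.get_type_hints(tp)
        return {
            "type": "OBJECT",
            "properties": {name: build_schema(t) for name, t in hints.items()},
            # The important part: list every property as required.
            "required": list(hints),
        }
    raise TypeError(f"unsupported type: {tp!r}")


class Recipe(typing.TypedDict):
    recipe_name: str
    ingredients: list[str]


schema = build_schema(Recipe)
print(schema["required"])
```

Without the "required" entry the model seems free to drop keys, which matches the behavior reported in this thread.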

MarkDaoust commented 1 month ago

@mwigh - Nice, I'll have to investigate this deeper. Can you send that commit as a PR?

MarkDaoust commented 1 month ago

It's interesting that Schema supports both "required" and "nullable" flags. That's a subtle distinction.

mwigh commented 1 month ago

Yes, I noticed that too. But I think this was the cleanest way of solving the issue, without creating new functions or modifying _build_schema(). And it is pretty obvious what it does, so someone can change it further if needed in the future.

mwigh commented 1 month ago

@MarkDaoust Ref your comment in the PR. I think you need to decide on what should be allowed or not, e.g.:

from dataclasses import dataclass
from typing import Optional

import pydantic
import typing_extensions as typing
from typing_extensions import NotRequired

class RecipeTypeDict(typing.TypedDict):
    recipe_name: str
    ingredient: str
    yyy: Optional[str]
    xxx: NotRequired[str]

@dataclass
class RecipeDataClass():
    recipe_name: str
    ingredient: str
    xxx: Optional[str]  # should = None be allowed?

class ReceiptPydantic(pydantic.BaseModel):
    recipe_name: str
    ingredient: str
    xxx: Optional[str]  # should = None be allowed?

In neither RecipeDataClass nor ReceiptPydantic is xxx considered optional. And if you make it optional by setting a default value, the code fails because 'default' exists in the dict. I showed it in this commit: https://github.com/mwigh/generative-ai-python/commit/2801c74190b3385572db0a02fc5cf747aa952558
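For the TypedDict case, the class already records which keys are required, so the distinction can be read straight off it. A minimal sketch (this only inspects the class; it is not what the SDK currently does, as far as this thread shows):

```python
try:  # Python 3.11+
    from typing import NotRequired, Optional, TypedDict
except ImportError:  # older Pythons: assumes typing_extensions is installed
    from typing import Optional
    from typing_extensions import NotRequired, TypedDict


class RecipeTypeDict(TypedDict):
    recipe_name: str
    ingredient: str
    yyy: Optional[str]     # key must be present, value may be None
    xxx: NotRequired[str]  # key may be omitted entirely


required = sorted(RecipeTypeDict.__required_keys__)
optional = sorted(RecipeTypeDict.__optional_keys__)
print(required)  # ['ingredient', 'recipe_name', 'yyy']
print(optional)  # ['xxx']
```

One possible mapping, and this is only an assumption about what would be sensible, is Optional[...] to "nullable" and NotRequired[...] to "left out of required".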

bakeryproducts commented 1 month ago

I'm not sure where exactly the issue with the pydantic schema is, but @MarkDaoust pops the required keys out of the schema here. In _schema_for_function they eventually get put back, but in _schema_for_class it seems they are not?

MaKTaiL commented 3 weeks ago

I'm also having an issue with response_schema not working as intended. It appears to ignore all keys of a Dict except the first one. This is my class:

class SubtitleObject(typing.TypedDict):
    index: str
    content: str

This is the response I'm getting:

[{'index': '0'}, {'index': '1'}, {'index': '2'}, {'index': '3'}, {'index': '4'}, {'index': '5'}, {'index': '6'}, {'index': '7'}, {'index': '8'}, {'index': '9'}, {'index': '10'}]
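Until this is fixed, the dropped keys can at least be detected on the client side. A minimal sketch, assuming the response text is a JSON object or a JSON array of objects; `missing_keys` is a hypothetical helper, not part of the SDK:

```python
import json


def missing_keys(payload: str, required: set) -> list:
    """Return (position, missing key names) for each object that lacks required keys."""
    items = json.loads(payload)
    if isinstance(items, dict):  # a single object instead of a list
        items = [items]
    report = []
    for position, item in enumerate(items):
        missing = sorted(required - set(item))
        if missing:
            report.append((position, missing))
    return report


sample = '[{"index": "0"}, {"index": "1", "content": "hi"}]'
problems = missing_keys(sample, {"index", "content"})
print(problems)  # [(0, ['content'])]
```

In real use, the payload would be result.text and the required set would be the keys of the TypedDict.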

Even the official docs show the same bug (the example output there is missing the ingredients key).

popovidis commented 3 weeks ago

Having this issue also.

antoo05-11 commented 2 weeks ago

I recently got this issue too :((

PeterMinin commented 2 days ago

Here's a workaround for now, before #580 is merged. You can tweak it if some of your fields are optional.

import google.generativeai as genai
from google.generativeai.types import generation_types

def force_required_fields(generation_config) -> dict:
    """
    Returns a copy with all fields in the schema marked as required.
    Workaround for https://github.com/google-gemini/generative-ai-python/issues/560.
    """
    generation_config = generation_types.to_generation_config_dict(generation_config)
    schema = generation_config["response_schema"]
    schema.required = list(schema.properties)
    return generation_config

# Usage:
generation_config = genai.GenerationConfig(
    response_mime_type="application/json",
    response_schema=MyClass,
)
generation_config = force_required_fields(generation_config)
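If some fields should stay optional, one possible tweak is to exclude them by name. This is a sketch only: `mark_required` is a hypothetical helper, it assumes nothing about the schema object beyond the `.properties` and `.required` attributes that the workaround above already relies on, and the demo uses a stand-in object rather than a real Schema, so it has not been run against the SDK.

```python
from types import SimpleNamespace


def mark_required(schema, optional=()):
    """Mark every property of `schema` as required except the names in `optional`."""
    optional = set(optional)
    unknown = optional - set(schema.properties)
    if unknown:
        raise KeyError(f"not in schema: {sorted(unknown)}")
    schema.required = [name for name in schema.properties if name not in optional]
    return schema


# Stand-in for the schema object, for illustration only.
demo = SimpleNamespace(
    properties={"recipe_name": {}, "ingredients": {}, "notes": {}},
    required=[],
)
mark_required(demo, optional=["notes"])
print(demo.required)  # ['recipe_name', 'ingredients']
```

Inside force_required_fields, this would replace the line that assigns schema.required.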
