Open neilp9 opened 2 months ago
It seems like your response schema lacks multiple moves (e.g., move1, move2, etc.) and does not include orderType or currentOrder. I used a schema similar to this:
{ "type": "object", "properties": { "thought": { "type": "string", "description": "A summary of the order state and how the coffee bot decides on the next move based on previous customer interactions." }, "move1": { "type": "string", "description": "The first action taken by the coffee bot.", "enum": [ "checkMenu", "addToOrder", "summarizeAndConfirm", "finishOrder", "changeItem", "removeItem", "changeModifier", "removeModifier", "cancelOrder", "greet", "close", "thanks", "redirect", "describe", "recover" ] }, "move2": { "type": "string", "description": "The second action taken by the coffee bot.", "nullable": true, "enum": [ "checkMenu", "addToOrder", "summarizeAndConfirm", "finishOrder", "changeItem", "removeItem", "changeModifier", "removeModifier", "cancelOrder", "greet", "close", "thanks", "redirect", "describe", "recover" ] }, "move3": { "type": "string", "description": "The third action taken by the coffee bot.", "nullable": true, "enum": [ "checkMenu", "addToOrder", "summarizeAndConfirm", "finishOrder", "changeItem", "removeItem", "changeModifier", "removeModifier", "cancelOrder", "greet", "close", "thanks", "redirect", "describe", "recover" ] }, "move4": { "type": "string", "description": "The fourth action taken by the coffee bot.", "nullable": true, "enum": [ "checkMenu", "addToOrder", "summarizeAndConfirm", "finishOrder", "changeItem", "removeItem", "changeModifier", "removeModifier", "cancelOrder", "greet", "close", "thanks", "redirect", "describe", "recover" ] }, "orderType": { "type": "string", "description": "Indicates the type of order; included after summarizing the order.", "nullable": true, "enum": [ "here", "to go" ] }, "response": { "type": "string", "description": "The response spoken by the coffee bot to the customer." }, "currentOrder": { "type": "array", "description": "The list of drinks and their modifiers currently in the order.", "items": { "type": "object", "description": "An item in the current order.", "properties": { "drink": { "type": "string", "description": "The name of the drink." }, "modifiers": { "type": "array", "description": "A list of modifiers applied to the drink.", "nullable": true, "items": { "type": "object", "properties": { "mod": { "type": "string", "description": "A modifier applied to the drink." } }, "required": [ "mod" ] } } }, "required": [ "drink" ] } } }, "required": [ "thought", "response" ] }
You can refer to the documentation link below for more clarity.
Thanks for the response. However, note that there is no requirement that the response schema support multiple moves or include all of the fields you've mentioned. At the end of the day, a simple schema with just thought, response, and move (or move1) as the only required fields is reasonable and should return reliably, but it doesn't.
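Concretely, the minimal schema I have in mind looks something like this (a sketch written as a plain dict in the REST-style schema format; the field names come from the Barista Bot example):

# Minimal response schema: only thought, move1, and response, all required.
# (Sketch; field names taken from the Barista Bot example.)
minimal_schema = {
    "type": "object",
    "properties": {
        "thought": {"type": "string"},
        "move1": {"type": "string"},
        "response": {"type": "string"},
    },
    "required": ["thought", "move1", "response"],
}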
After testing with your schema, I see that adding the "required" field does help the model deliver more consistent outputs. This works when I add it manually in AI Studio; however, I'm still experiencing the original issue over the Python SDK. Following the example in the docs, I'm unable to get Barista Bot to return consistently when passing the following values for response_schema via the API. I tried both approaches below:
import typing_extensions as typing

class ModelResponse(typing.TypedDict):
    response: str
    move1: str
    thought: str
from pydantic.dataclasses import dataclass

@dataclass
class ModelResponse:
    response: str
    move1: str
    thought: str
model = genai.GenerativeModel(
    'gemini-1.5-flash',
    generation_config={
        "response_mime_type": "application/json",
        "response_schema": ModelResponse,
    },
)
I have run into a similar issue. Here's a minimal example:
import google.generativeai as genai
import typing_extensions as typing

class Card(typing.TypedDict):
    word: str
    category: str
    definition: list[str]
    example: list[str]

prompt = "Create a German vocabulary flashcard. Include fields 'word', 'category', 'definition' and 'example'"

model = genai.GenerativeModel(
    "gemini-1.5-flash",
    generation_config={
        "candidate_count": 1,
        "response_mime_type": "application/json",
        "response_schema": Card,  # <<<<<<<<<<<<<<< Comment this out
    },
    system_instruction=prompt,
)

response = model.generate_content("befassen")
print(response.text)
#> {"word": "befassen (sich mit etwas / jemanden) [sich beschÀftigen] (verb) (reflexive verb) [to deal with, to be concerned with,
#> to occupy oneself with] (transitive verb) [to busy oneself with] (intransitive verb) [to have to do with] (informal) [to have a
#> go at] (transitive verb) [to touch] (intransitive verb) [to touch upon] (transitive verb) [to grip, to hold, to grasp] (intransitive
#> verb) [to apply oneself] (informal) [to get stuck into] (intransitive verb) [to get to grips with] (intransitive verb) [to take up]
#> (transitive verb) [to engage in] (transitive verb) [to involve oneself in] (transitive ver...
After commenting out the "response_schema" line, the result looks much more reasonable:
{"word": "befassen", "category": "Verb", "definition": "sich mit etwas beschÀftigen, sich in etwas einarbeiten", "example": "Ich befasse mich gerne mit Geschichte."}
This started happening about 10 days ago.
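For what it's worth, a quick way to check programmatically whether a given run matches the declared fields (a sketch using only the standard library; it validates the top-level keys, not the value types):

import json

def matches_card(text: str) -> bool:
    # True if the output parses as JSON and has exactly the four Card fields.
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and set(data) == {"word", "category", "definition", "example"}

print(matches_card(response.text))

Note that even the reasonable-looking output above returns definition and example as plain strings rather than the declared list[str], so the keys are only part of the story.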
Thanks @kinto-b, I've seen the same behavior: not sending the response_schema generates more reasonable responses.
I tried your example in AI Studio, and the results seem more reasonable. So perhaps there's a bug in the Python SDK?
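To narrow that down, it may help to bypass the SDK entirely and call the REST endpoint directly with an equivalent schema. A sketch using requests (the endpoint and camelCase field names follow the public REST docs; the uppercase type names are the REST Type enum values):

import os
import requests

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-1.5-flash:generateContent"
)
body = {
    "systemInstruction": {"parts": [{"text": "Create a German vocabulary flashcard. Include fields 'word', 'category', 'definition' and 'example'"}]},
    "contents": [{"parts": [{"text": "befassen"}]}],
    "generationConfig": {
        "responseMimeType": "application/json",
        "responseSchema": {
            "type": "OBJECT",
            "properties": {
                "word": {"type": "STRING"},
                "category": {"type": "STRING"},
                "definition": {"type": "ARRAY", "items": {"type": "STRING"}},
                "example": {"type": "ARRAY", "items": {"type": "STRING"}},
            },
            "required": ["word", "category", "definition", "example"],
        },
    },
}
resp = requests.post(url, params={"key": os.environ["GOOGLE_API_KEY"]}, json=body)
print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])

If the raw REST call honors the schema while the SDK call doesn't, the problem is in the SDK's schema handling; if both misbehave, it's server-side.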
Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.
The issue is not resolved.
Description of the bug:
The Barista Bot on AI Studio is designed to output JSON in a format specified in the user prompt. From the prompt:
This works without JSON mode enabled, and even with JSON mode enabled and a blank response_schema specified. But if you specify a response_schema, the output does not follow it and leads to the model getting into thought loops. The response_schema I'm giving it:

Actual vs expected behavior:
Expected (which can be reliably generated when JSON mode is disabled):
Actual (you can get it to generate consistently inconsistent output every time you re-run; this is a particularly obvious example for illustration):
Any other information you'd like to share?
This issue doesn't seem specific to the Python package, but I did reproduce it via the API from my Python app. Sorry if I should have logged this in a better place!