json mode outputs extra brackets or doesn't include

gitcagey commented 4 months ago

Description of the bug:

We are seeing intermittent but troubling behavior: with response_mime_type set to 'application/json' and a schema in the prompt, gemini 1.5 pro either includes extra brackets or fails to close the output with enough brackets.

Actual vs expected behavior:

Actual behavior: gemini 1.5 pro outputs invalid json

Expected behavior: gemini 1.5 pro outputs valid json

Any other information you'd like to share?

Here is an example I produced in colab specifically to repro and troubleshoot (note this has never happened in Google AI Studio, where we prototype). This started happening in our production app. To repro, I started running my prompts in colab so I could isolate the problem to the API call rather than something in our app.

Here is a snip of the json output:

Here is the error when run through a simple online formatter:

I remove the 2 trailing brackets:

and it formats just fine

Note: far more commonly, the last 2 brackets of the json output are missing. When we add them, it works fine.
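The manual fix described above (dropping surplus trailing brackets, or appending the missing ones) can be sketched in Python. This is only an illustration of that workaround, not anything the API provides: `repair_trailing_brackets` is a hypothetical helper, and it assumes the only damage is at the very end of the output.

```python
import json


def repair_trailing_brackets(text: str) -> str:
    """Drop surplus trailing closers or append missing ones, then validate.

    Hypothetical helper: it only handles damage at the end of the output.
    """
    expected_closers = []  # closers still owed, innermost last
    kept = []
    in_string = False
    escaped = False
    for ch in text.strip():
        if in_string:
            # Brackets inside string values must not be counted.
            kept.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            expected_closers.append("}" if ch == "{" else "]")
        elif ch in "}]":
            if not expected_closers:
                continue  # surplus closer after the document is complete
            if expected_closers[-1] == ch:
                expected_closers.pop()
        kept.append(ch)
    repaired = "".join(kept) + "".join(reversed(expected_closers))
    json.loads(repaired)  # raises json.JSONDecodeError if still invalid
    return repaired
```

If the output is broken in some other way (for example, cut off in the middle of a string), `json.loads` still raises, so the caller can fall back to retrying the request.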

singhniraj08 commented 4 months ago

@gitcagey, Running the example code in Generate JSON output with the Gemini API, I can see the response output in JSON format without any extra brackets. Can you please share example code or a gist so we can reproduce the issue on our end? Thank you! Ref: screenshot below

LindaLawton commented 4 months ago

@gitcagey Can I see your code and the prompt you are sending? Did you include the json mime type?

gitcagey commented 4 months ago

@singhniraj08 I didn't want to post the entire code/prompts here, as it is for a client. I put it in a secret gist and can send it to you. I also have a colab where I was able to repro similar behavior. Our json is a bit more complicated than what is in the examples and cookbooks for gemini, like the recipe example, as ours is nested.

@LindaLawton Yes, mime type is set to application/json, but we are only passing the schema in the prompt. Not using response_schema yet.

MarkDaoust commented 4 months ago

Hi, thanks for the report. This shouldn't be happening. If you use a response schema, that should trigger constrained decoding, which should only give valid output.

Can you send me the Colab you said reproduces this? (You can get my email from git log on this repo.)

MarkDaoust commented 4 months ago

> Not using response_schema yet.

response_schema should work better, since that's necessary to engage the constrained decoding.
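As a rough sketch of what passing response_schema could look like with the google-generativeai Python SDK: the `TypedDict` classes and their field names below are hypothetical stand-ins for the real nested schema, and the snippet assumes a recent SDK version, that `genai.configure(api_key=...)` has already been called, and that nested `TypedDict` schemas are accepted.

```python
import typing


# Hypothetical nested schema; the field names are illustrative only.
class SubSection(typing.TypedDict):
    title: str
    description: str


class Section(typing.TypedDict):
    title: str
    description: str
    sub_sections: list[SubSection]


class Document(typing.TypedDict):
    title: str
    description: str
    sections: list[Section]


def generate_document(prompt: str, model_name: str = "gemini-1.5-pro") -> str:
    """Request JSON constrained by Document instead of a schema in the prompt."""
    # Imported lazily so the schema classes can be reused without the SDK.
    import google.generativeai as genai

    model = genai.GenerativeModel(model_name)
    response = model.generate_content(
        prompt,
        generation_config=genai.GenerationConfig(
            response_mime_type="application/json",
            response_schema=Document,
        ),
    )
    return response.text
```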

Colab

Now that I've seen this, my main feedback is that you're asking it to do a lot in 1 request.

Consider using prompt chaining to make this easier for the model: "Generate the title, description and list of section titles", "for section {n}, {sections[n].title}, generate the detailed description (100 words) and the list of sub-sections"...

You can use a simpler schema for each, and join all the partial responses together to make that schema you actually want.
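A minimal sketch of that chaining idea, assuming a `generate` callable that takes a prompt and returns the model's JSON text (for example, a thin wrapper around the API call). The function name, the JSON keys, and the exact prompt wording are assumptions for illustration.

```python
import json
from typing import Callable


def build_document(generate: Callable[[str], str], topic: str) -> dict:
    """Build a nested document from several small requests.

    `generate` is a hypothetical callable: prompt in, JSON text out.
    """
    # Step 1: a small, flat schema for the outline.
    outline = json.loads(generate(
        f"Generate the title, description and list of section titles for: {topic}. "
        'Reply as JSON: {"title": str, "description": str, "section_titles": [str]}'
    ))

    # Step 2: one request per section, each with its own simple schema.
    sections = []
    for n, section_title in enumerate(outline["section_titles"], start=1):
        detail = json.loads(generate(
            f"For section {n}, {section_title}, generate the detailed description "
            "(100 words) and the list of sub-sections. "
            'Reply as JSON: {"description": str, "sub_sections": [str]}'
        ))
        sections.append({"title": section_title, **detail})

    # Step 3: join the partial responses into the nested structure.
    return {
        "title": outline["title"],
        "description": outline["description"],
        "sections": sections,
    }
```

Each individual response is short and flat, so a malformed one only costs a retry of that single step rather than the whole document.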

MarkDaoust commented 4 months ago

> 2 extra or missing brackets ... 2-3 minutes per call

I failed to replicate either of these. I was getting cut off for max_tokens, and it was sometimes taking >10min.

gitcagey commented 4 months ago

Thanks for looking. FWIW, running with the mods you made in colab, I instantly (2 mins) get invalid json. I'll look at prompt chaining.

vs manually corrected...

MarkDaoust commented 4 months ago

Okay, I've been able to reproduce this a little, though I'm still seeing a lot of timeouts.

I think response_schema could help, but like you pointed out, it's not available for 1.5-flash yet, and it can get trapped generating nonsense. I'll keep this open until we have a better fix.

I think your actual best bet right now is to keep doing what you're doing and retry when you get a parse error.
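A retry-on-parse-error loop might look like the sketch below. `generate` is a hypothetical zero-argument callable that performs one API call and returns the raw response text; the function name and the attempt limit are arbitrary choices.

```python
import json
from typing import Any, Callable


def generate_json_with_retry(generate: Callable[[], str], max_attempts: int = 3) -> Any:
    """Call `generate` until its output parses as JSON, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        raw = generate()
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            last_error = err  # remember the failure and try again
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error
```

Given the 2-3 minute calls mentioned earlier in the thread, a low `max_attempts` keeps the worst-case latency bounded.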

MarkDaoust commented 4 months ago

Oh, interesting. The success statistics are very different with response_schema if you drop the text copy of the schema from the system instructions.

I get:

Flash - 1/20
- Can't match up the braces.
Pro, response_schema + text schema in SI - 9/20
- Often rambles until max_tokens.
Pro, text schema in SI only - 19/20
- Rarely mixes up the braces.
Pro, response_schema only - 20/20
- Seems good.
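Counts like these can be gathered with a small harness along the following lines. This is a guess at the general approach, not the script that produced the numbers above; `generate` is a hypothetical zero-argument callable returning one raw response, and only parse failures are counted (timeouts would need separate handling).

```python
import json
from typing import Callable


def count_valid_json(generate: Callable[[], str], trials: int = 20) -> int:
    """Return how many of `trials` responses parse as valid JSON."""
    valid = 0
    for _ in range(trials):
        try:
            json.loads(generate())
            valid += 1
        except json.JSONDecodeError:
            pass  # counted as a failure
    return valid
```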

google-gemini / generative-ai-python