google-gemini / generative-ai-js

The official Node.js / TypeScript library for the Google Gemini API
https://www.npmjs.com/package/@google/generative-ai
Apache License 2.0
679 stars 141 forks

Gemini 1.5 Flash: Candidate was blocked due to RECITATION when responseMimeType is json #138

Open marian2js opened 4 months ago

marian2js commented 4 months ago

Description of the bug:

When responseMimeType: 'application/json', a request is failing with error: [GoogleGenerativeAI Error]: Candidate was blocked due to RECITATION.

However, without responseMimeType, the same prompt works (it returns markdown containing JSON).

The exact same instructions and prompt work in AI Studio, even with JSON output enabled.

// The error happens even if safety settings are set to block none.
const safetySettings = [
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_NONE,
  },
]

const model = this.genAI.getGenerativeModel({
  model: 'gemini-1.5-flash-latest',
  systemInstruction: instructions,
  safetySettings,
})

const generationConfig = {
  temperature: 0,
  topP: 0.95,
  topK: 64,
  maxOutputTokens: 8192,
  responseMimeType: 'application/json', // fails only if this option is sent. 
}

const chatSession = model.startChat({
  generationConfig,
})

const result = await chatSession.sendMessage(prompt)
const text = result.response.text() // throws [GoogleGenerativeAI Error]: Candidate was blocked due to RECITATION.

Actual vs expected behavior:

Actual: Throws [GoogleGenerativeAI Error]: Candidate was blocked due to RECITATION
Expected: Return the same result as in AI Studio.

Any other information you'd like to share?

No response

ryanwilson commented 4 months ago

Hi @marian2js, sorry for the troubles. Does every prompt cause this, or only specific prompts? If it's a specific prompt, are you able to share the prompt (and system instructions) so we can try to reproduce it?

marian2js commented 4 months ago

Hi @ryanwilson. The issue happens only with a very specific prompt that I cannot share publicly. I've been trying to remove the personal data from it, but as soon as I do, it starts working.

I noticed that when I remove the responseMimeType, the JSON returned in the markdown is invalid: it contains an unquoted JS identifier as a value, e.g. { "key": value }. However, the model also returned invalid JSON for a different prompt, so I don't know if the issue is related to that.

I am sorry for not being of more help.

ryanwilson commented 4 months ago

No worries! Out of curiosity, do you run into the same issue if you use sendMessageStream instead of sendMessage? That could be the difference with AI Studio, where the response is streamed.

marian2js commented 4 months ago

Hi @ryanwilson, the bug doesn't happen with sendMessageStream. I made changes to my prompt, so this bug is not triggered anymore for me. But I can confirm the bug is still happening with my old prompt and sendMessage.
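For anyone else hitting this, the streaming workaround can be sketched roughly as follows. This is a minimal sketch, not an official recommendation: it assumes a `chatSession` created via `model.startChat({ generationConfig })` as in the original report, and the helper name `sendViaStream` is made up for illustration.

```javascript
// Minimal sketch of the streaming workaround: use sendMessageStream
// instead of sendMessage and concatenate the streamed chunks.
// `chatSession` is assumed to come from model.startChat({ generationConfig }).
async function sendViaStream(chatSession, prompt) {
  const result = await chatSession.sendMessageStream(prompt);
  let aggregated = '';
  // Each streamed chunk carries a partial response; concatenate them.
  for await (const chunk of result.stream) {
    aggregated += chunk.text();
  }
  return aggregated;
}
```

The SDK also exposes `result.response`, a promise for the aggregated response, so `(await result.response).text()` should yield the same text as the manual concatenation above.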

florian583 commented 4 months ago

Hi guys,

I'm experiencing the exact same problem as @marian2js. We're using Gemini with Vertex to extract structured data (as JSON) from job offer listing PDFs.

Here's what I've tried so far:

Oddly enough, the issue seems to only happen when processing non-English documents. When I upload PDFs in French, I always run into the RECITATION problem.

On the flip side, if I use Google Vision for OCR on the PDF, and then use the Vercel AI SDK with chat and Gemini 1.5 Flash, it works perfectly with the same prompt, but on the OCR data (string) instead of the inline PDF.

Hope this info helps in figuring out the issue!

GuyVivedus commented 4 months ago

Hi folks,

Running into the same problem as @marian2js (with almost identical API call settings but with Golang - model conf, generationConfig, SafetyConfig all the same). Again parsing a plaintext file with a big prompt to output JSON (where the plaintext was originally converted from PDF).

Setting model.GenerationConfig.ResponseMIMEType = "application/json" hits the "blocked: candidate: FinishReasonRecitation" resp with no PromptFeedback value, annoyingly, so flying a bit blind as to the cause.

If it's helpful the content is senior school syllabus so shouldn't be anywhere close to hitting any of the harm response thresholds either way.

Commenting out ResponseMIMEType, with a slight prompt modification, gets a usable (yet obviously inconsistent) result.

Much appreciated!

KlimentP commented 4 months ago

Hi all, running into the same issue with model: 'gemini-1.5-flash-latest' and responseMimeType: 'application/json'.

I am parsing 50 text documents by passing a json schema for the desired output, and more than half of them fail, but always the same ones.

Setting the response type to stream does not fix it; the error always occurs in the same chunk.

Prompt is f"""You will be provided content in the form of html. Using this content, return a valid json object that is based entirely on information from the content, not your guess. The content should satisfy the following json schema: {schema} """
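For reference, a JS equivalent of the Python f-string prompt above might look like the sketch below; the `schema` object here is a made-up placeholder for the desired output schema, not the one actually used.

```javascript
// Hypothetical JS equivalent of the Python f-string prompt above.
// `schema` is a placeholder for the desired output's JSON schema.
const schema = { type: 'object', properties: { title: { type: 'string' } } };
const prompt = `You will be provided content in the form of html. Using this content, return a valid json object that is based entirely on information from the content, not your guess. The content should satisfy the following json schema: ${JSON.stringify(schema)}`;
```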

chanmathew commented 4 months ago

Experiencing same issue as well, same setup as others above.

I also get it if I run generateContent:

const generationConfig = {
  temperature: 1,
  topP: 0.95,
  topK: 64,
  maxOutputTokens: 8192,
  responseMimeType: 'application/json'
}

const safetySettings = [
  {
    category: HarmCategory.HARM_CATEGORY_HARASSMENT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  },
  {
    category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  },
  {
    category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  },
  {
    category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
    threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
  }
]
const model = genAI.getGenerativeModel({
  model: 'gemini-1.5-flash-latest',
  generationConfig,
  safetySettings,
  systemInstruction: 'My prompt here.'
})

const result = await model.generateContent(content)

const response = result.response
const text = response.text()

ShivQumis commented 4 months ago

Changing to stream mode worked for me as well:

const safetySettings = [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_NONE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
      threshold: HarmBlockThreshold.BLOCK_NONE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
      threshold: HarmBlockThreshold.BLOCK_NONE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
      threshold: HarmBlockThreshold.BLOCK_NONE,
    },
];

try {
  const chatSession = this.model.startChat({
    generationConfig: {
      temperature: 1,
      topP: 0.95,
      topK: 64,
      maxOutputTokens: 8192,
      responseMimeType: "text/plain",
    },
    history: history,
    systemInstruction: systemMessage,
    safetySettings: safetySettings
  });

  const stream = true;
  if (stream) {
    const result = await chatSession.sendMessageStream(currentMessage);
    for await (const item of result.stream) {
      console.log("Stream chunk: ", item.candidates[0].content.parts[0].text);
    }
    const aggregatedResponse = await result.response;
    return aggregatedResponse.text();
  } else {
    const result = await chatSession.sendMessage(currentMessage);
    return result.response.text();
  }
} catch (error) {
  console.error("Error during chat session:", error);
  throw error; // Re-throw the error after logging, or handle it as needed
}

yharaskrik commented 4 months ago

I started receiving this today as well. I didn't get it at all yesterday across hundreds of calls; today a significant number of them are hitting the error.

jsomeara commented 4 months ago

Also getting this error.

Kashi-Datum commented 4 months ago

Any updates on this? As it stands, using Gemini 1.5 Pro for RAG-based Q&A with chat history is incredibly unreliable.

Simply ask 2-3 questions about a document with some overlap, and you will almost certainly encounter a RECITATION error.

I.e, using LangChain:

const chatTemplate = ChatPromptTemplate.fromMessages([
    ['system', 'You are a helpful assistant. Answer all questions to the best of your ability.'],
    new MessagesPlaceholder('history'),
    new MessagesPlaceholder('input')
]);
const chain = chatTemplate.pipe(pdfModel).pipe(new StringOutputParser());

const chainWithHistory = new RunnableWithMessageHistory({
        runnable: chain,
        getMessageHistory: () => chatHistory,
        inputMessagesKey: 'input',
        historyMessagesKey: 'history',
        config: { configurable: { sessionId: decodedToken.uid } }
    });

const response = await chainWithHistory.invoke({ input: input });

If you simply ask the model to recite a specific section of a file/PDF twice in a row, you will get a RECITATION error GUARANTEED (whether you pass the input file in once at the beginning of the conversation, only in the latest message, or as part of every message is irrelevant).

Aftab-M commented 3 months ago

I'm getting the same error. Isn't there a way to know in advance whether the input contains something that might cause this error, so it can be skipped?
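There doesn't seem to be a documented way to pre-check an input; one pragmatic option is to catch the error and fall back or skip. Below is a minimal sketch under that assumption; both helper names are made up, and the message check relies on the error text reported in this thread.

```javascript
// Hypothetical catch-and-fallback helpers. `isRecitationError` matches the
// error text seen in this thread; `sendWithStreamFallback` retries the same
// message over sendMessageStream, which some commenters report avoids the
// block.
function isRecitationError(err) {
  return /blocked due to RECITATION/.test(String(err && err.message));
}

async function sendWithStreamFallback(chatSession, prompt) {
  try {
    const result = await chatSession.sendMessage(prompt);
    return result.response.text();
  } catch (err) {
    if (!isRecitationError(err)) throw err; // unrelated failure: rethrow
    const streamed = await chatSession.sendMessageStream(prompt);
    const aggregated = await streamed.response;
    return aggregated.text();
  }
}
```

Note this only reacts after the fact; it does not predict which inputs will trip the recitation checker.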

HarshavardhanNetha commented 3 months ago

> Hi @ryanwilson, the bug doesn't happen with sendMessageStream. I made changes to my prompt, so this bug is not triggered anymore for me. But I can confirm the bug is still happening with my old prompt and sendMessage.

I have tried the same; it seems the issue still persists with sendMessageStream as well. One more interesting thing I found: when using sendMessageStream along with some tools, it produces a 500 Internal Server Error.

0xthierry commented 3 months ago

I have the same problem. I'm sending a screenshot of a PDF; it works in some cases but not in others.

jouwana commented 3 months ago

I also have the same problem.

I compared 1.0 Pro against 1.5 Flash (since I got the email that 1.0 Pro is becoming deprecated soon), and I tried 1.5 Pro as well.

When running 1.0 Pro and 1.5 Pro, I have zero issues, no matter how many times I retry the same content generation in a row or how many requests I send (assuming I'm not rate-limited).

[screenshot: pro]

Yet when I run gemini-1.5-flash, with the exact same prompt and the exact same code, I randomly get the recitation error (which I am currently not handling in a catch, just so I can see it easily in the console).

[screenshot: flash]

The thing is, it's not even particularly faster than 1.0 Pro, and 1.5 Pro also does not give this error.

I am on the free tier, so I'm mostly comparing 1.0 Pro and 1.5 Flash, as they both have similar request limits.

I did not change any settings (no safety settings, no response type, nothing), so everything should be at the defaults.

HarshavardhanNetha commented 3 months ago

> [quoting @jouwana's comment above in full]

By any chance, have you tried using tools? And experimented with various models? Also, have you compared sendMessage and sendMessageStream?

jouwana commented 3 months ago

> By any chance, have you tried using tools? And experimented with various models? Also, any work done on sendMessage & sendMessageStream?

I am unsure what 'tools' refers to, so I probably haven't tried it.

I have tried different things with the different models: prompt form, length, sending the same one repeatedly vs. sending different ones in a row. Only Flash has the problem, and it pops up no matter what prompts I'm sending, usually after 2-3 sends; sometimes it can last to 4-5 sends without error, but that's not as common.

As for sendMessage and sendMessageStream, I am mostly sending unrelated prompts, which is why I preferred generateContent over startChat and sendMessage, so I haven't tried them for now.

chanmathew commented 3 months ago

@hsubox76 just wanted to page contributors here, this is a MAJOR bug, especially when you guys are sunsetting 1.0 pro and moving everyone to 1.5 flash. It is literally unusable right now with this recitation error. This is the only reason we cannot migrate over from other foundation models. You are losing out on customers.

Is anyone on the team looking at this?

httplups commented 3 months ago

Same problem here using Gemini 1.5 Pro

holaggabriel commented 3 months ago

I have this problem using gemini-1.5-flash. If I send the same prompt a second time it shows me this error: [GoogleGenerativeAI Error]: Candidate was blocked due to SAFETY

Prompt example: Write a romantic sentence that includes the number '5' written exactly as '5'. Make sure the number appears in numerical format and not in words.

If you send the same thing but change only the number, it generates the error. For example: Write a romantic sentence that includes the number '73' written exactly as '73'. Make sure the number appears in numerical format and not in words.

rartin commented 3 months ago

Same here; it looks like this issue is stuck at the bottom of the backlog pile.

hsubox76 commented 3 months ago

This does seem like a serious issue - unfortunately it's beyond our ability to fix in the SDK. We're asking anyone with issues concerning the service or the models to bring them to the discussion forum: https://discuss.ai.google.dev/ where they will be more likely to reach those who work on the models themselves.

If posting in the forum, feel free to link back to this issue as it contains a lot of info and examples of the problem which might be helpful background, and if someone opens a thread in the discussion forum, please link it here so that all the JS SDK users facing this issue can add to that thread and hopefully get it looked at.

We will also try to get some answers internally but it's probably best for us to use all channels, including the discussion forum.

logankilpatrick commented 3 months ago

Hey folks, following up, we are investigating this and will share more details as soon as we have an update on our end!

matadornetwork commented 3 months ago

> Hey folks, following up, we are investigating this and will share more details as soon as we have an update on our end!

Hi Logan, any updates on this?

pinballsurgeon commented 3 months ago

FYI - here is a recitation failure I find reproducible and strange; the behavior/error differences between Node.js and the console may be telling. Perhaps the list content/encoding causes the sensitivity, but no change to parameters can prevent the error -

NO ERROR - CHESS OPENINGS - instructions prompt - System: You are a service that returns tidy, single line list of domain items responses to prompts. Your response is only a comma seperated lists of items like 'item1, item2, item3..'. List as many items as performantly possible to enrich the data. If prompted for a list of fruits, you respond with 'apple, pear, banana, ..'. Do not include ‘..’ in the response, instead add as many items (10-50) as needed to fulfil the request in a timely manner. \n\nHuman: Given these instructions, the prompt asks for a comma separated list of chess openings, return your tidy response of items. \n\nAssistant: Here is the requested list of comma seperated domain items;

RECITATION ERROR - POKEMON - instructions prompt - System: You are a service that returns tidy, single line list of domain items responses to prompts. Your response is only a comma seperated lists of items like 'item1, item2, item3..'. List as many items as performantly possible to enrich the data. If prompted for a list of fruits, you respond with 'apple, pear, banana, ..'. Do not include ‘..’ in the response, instead add as many items (10-50) as needed to fulfil the request in a timely manner. \n\nHuman: Given these instructions, the prompt asks for a comma separated list of pokemon, return your tidy response of items. \n\nAssistant: Here is the requested list of comma seperated domain items;

saranggupta94 commented 3 months ago

I am facing the same issue. Following this thread.

shashank734 commented 3 months ago

Facing the same issue with 1.5 Pro.

Samson-DB commented 3 months ago

Facing the same issue for a RAG use case using gemini-1.5.pro. It's very unpredictable as the error depends on the output and not always the context/prompt.

chreds commented 3 months ago

Facing the same problem.

anthonylee991 commented 3 months ago

Ran into this issue as well today.

iliane5 commented 3 months ago

Having the same issues with flash and pro. Impossible to use them in production.

stebansaa commented 3 months ago

having this same problem

MatejaTrik commented 2 months ago

Same issue here, although I am summarizing books from PDFs provided by me. Could it be something about authors' rights (copyright)?

jamesjhonatan123 commented 2 months ago

same issue here

matadornetwork commented 2 months ago

> Hey folks, following up, we are investigating this and will share more details as soon as we have an update on our end!

Logan any update?

0xSMW commented 2 months ago

can confirm same issue still

logankilpatrick commented 2 months ago

Hey folks, I am going to declare an internal code red on this until it is resolved, sorry it has taken so long. Will provide updates twice a week until it is resolved.

logankilpatrick commented 2 months ago

Update: We have found what we believe to be part of the problem when sending requests with JSON mode enabled. Potential fix in the works, will share more on the timeline once we see how it goes.

Note that it's not clear everyone in this thread has the same issue, the current fix we are looking at is for specifically requests sent with JSON mode enabled using 1.5 Flash and 1.5 Pro.

chanmathew commented 2 months ago

> Update: We have found what we believe to be part of the problem when sending requests with JSON mode enabled. Potential fix in the works, will share more on the timeline once we see how it goes.
>
> Note that it's not clear everyone in this thread has the same issue, the current fix we are looking at is for specifically requests sent with JSON mode enabled using 1.5 Flash and 1.5 Pro.

You're the best Logan! Appreciate the urgency! 👏

ShivQumis commented 2 months ago

> Update: We have found what we believe to be part of the problem when sending requests with JSON mode enabled. Potential fix in the works, will share more on the timeline once we see how it goes.
>
> Note that it's not clear everyone in this thread has the same issue, the current fix we are looking at is for specifically requests sent with JSON mode enabled using 1.5 Flash and 1.5 Pro.

Just pointing out that a significant number of the examples with errors above do not have JSON mode enabled.

matadornetwork commented 2 months ago

> Just pointing out significant amount of examples with errors do not have JSON mode enabled.

Yes echoing this statement. We found it happening with both JSON mode enabled and disabled.

baxteran commented 2 months ago

Hi, I'm also suffering from the RECITATION problem when using Gemini 1.5 Pro with llamaIndex CondensePlusContextChatEngine. The problem only occurs (for me) on subsequent turns of the chat, not the first. Whether an exception is thrown seems to depend on the RAG context and the user question, but it is repeatable for the same conversation sequences. Looking forward to your fix - thanks very much!

logankilpatrick commented 2 months ago

Update: we discovered a bug related to JSON Schema mode and recitation checks, we rolled out the change yesterday and are monitoring the impact. This should help the issue of recitation errors when in JSON mode.

Note that as mentioned, this doesn't capture all of the recitation comments in this thread which is a mix of different use cases. We have not identified any known issues with other parts of the recitation checking at this time but are still investigating.

ShivQumis commented 2 months ago

Thanks Logan, appreciate the update. However, I do have to say it's disappointing that, out of all the examples provided above, and with the issue being open for almost four months, the team at Google has been unable to identify the issue (with non-JSON mode, which is the majority of cases) or act with a sense of urgency to remediate it. The inconsistent but frequent occurrence (almost daily) of this error effectively makes the API useless for our use case, and we switched back to Claude.

iliane5 commented 2 months ago

> We have not identified any known issues with other parts of the recitation checking at this time but are still investigating.

If sending problematic prompts helps troubleshooting, I think everyone in this thread and on the bug tracker (https://issuetracker.google.com/issues/331677495?pli=1) would be more than happy to send a few ;)

> The inconsistent but frequent occurrence (almost daily) of this error effectively makes the API useless for our use-case and we switched back to Claude.

Same here. I'm still hoping for the bug to be fixed at some point, but it looks like this is somehow expected behaviour (?) if no issue with the recitation checker has been found.

gitcagey commented 2 months ago

Thanks @logankilpatrick! We have been fighting with this over the past week, trying to pull glossary terms out of some documents we use for work. Almost to the point where I was going to have to consider changing LLMs.

I changed some of our prompting around today and did not get the error where we usually do (JSON mode enabled using 1.5 Pro). I thought it was my prompting, but maybe it was this change. Will test further.

UPDATE 7/18 - still getting this error, but not as much. Doing nothing out of the ordinary. It feels like some overly protective copyright intent is getting misapplied to the context (not just the training data), even though there are no copyright issues here.

KlimentP commented 2 months ago

Can confirm that previous test examples (gemini flash 1.5 in JSON mode) that used to cause the error, no longer cause issues

benripka commented 2 months ago

Our users get hundreds of RECITATION errors a day, with JSON mode disabled. Luckily we fall back to other LLM APIs, which do not suffer from such a glaring flaw and can handle the exact same prompts fine. Typically, the error occurs when the prompt asks the LLM to repeat something from earlier in the conversation, with some formatting change, for example. Many such prompts result in RECITATION errors. Is this the intent behind the RECITATION blocker - that it prevents the model from repeating text strings from earlier in the chat conversation? If so, it greatly reduces the utility of the API. I was under the impression that the intent was to block recitation of training data.

gitcagey commented 2 months ago

Any update on this? It is back to blocking us now. We are trying to use Gemini to extract key terms/a glossary from documents. There are some docs where it fails nearly 100% of the time, even with some creative prompting.