google-gemini / generative-ai-js

The official Node.js / TypeScript library for the Google Gemini API
https://ai.google.dev/
Apache License 2.0

FinishedReason is `OTHER` for safety violation #165

Closed: ktalebian closed this issue 2 weeks ago

ktalebian commented 3 weeks ago

Description of the bug:

I am using the 1.5-pro model with the latest Node.js SDK. I've set my safety settings as high as possible.

When I test a prompt that violates the safety settings, the response comes back with `finishReason: 'STOP'`. The content, however, basically says, "I cannot create this because I do not support hate."

I rely on `finishReason` to determine whether the response should be accepted (the check is roughly sketched below). What am I doing wrong?

The code is:

const { response } = await this.model.generateContent({
  generationConfig: {
    responseMimeType: 'application/json',
    responseSchema: GeminiAIExperience.RESPONSE_SCHEMA,
  },
  safetySettings: GeminiAIExperience.SAFETY,
  contents,
  systemInstruction,
});

With SAFETY being

[
    {
      category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
  ];
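
The check I rely on is roughly this (a simplified sketch, not the exact production code; `FinishReason` is the enum exported by the SDK):

import { FinishReason } from '@google/generative-ai';

// Simplified version of the acceptance check: only take the candidate
// if generation finished normally; reject SAFETY / RECITATION / OTHER.
const candidate = response.candidates?.[0];
if (candidate?.finishReason !== FinishReason.STOP) {
  throw new Error(`Rejected response: finishReason=${candidate?.finishReason}`);
}
const text = response.text();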

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

ryanwilson commented 2 weeks ago

Hi @ktalebian, thanks for reaching out.

This is working as intended: the `safetyRatings` on a candidate relate to the generated output text, not the prompt. In this case, "I cannot create this because I do not support hate" is a reasonable answer and isn't flagged.
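
To illustrate where the safety information lives on a response (a rough sketch; `model` and `request` stand in for your own objects, and the field names follow the SDK's current response types, so double-check against your version):

const { response } = await model.generateContent(request);

// Prompt-side: if the *input* itself tripped a safety filter, the response
// carries promptFeedback.blockReason and usually no usable candidates.
if (response.promptFeedback?.blockReason) {
  console.log('Prompt blocked:', response.promptFeedback.blockReason);
}

// Candidate-side: finishReason and safetyRatings describe the *generated*
// text. A polite refusal is ordinary text, so it ends with finishReason
// STOP and benign safetyRatings.
const candidate = response.candidates?.[0];
console.log(candidate?.finishReason, candidate?.safetyRatings);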

If you're looking to use the model to evaluate safety before allowing a user to post on a forum, for example, you could prompt Gemini to act as an evaluator and output safety verdicts based on your own criteria.
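
A very rough sketch of that evaluator approach (the client setup, model name, prompt, schema, and `userPost` variable are all just illustrative, not a recommended recipe):

import { GoogleGenerativeAI, SchemaType } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.API_KEY);

// Hypothetical moderation helper: ask Gemini to classify the user's text and
// return a structured verdict, instead of inferring safety from finishReason.
const moderator = genAI.getGenerativeModel({
  model: 'gemini-1.5-pro',
  generationConfig: {
    responseMimeType: 'application/json',
    responseSchema: {
      type: SchemaType.OBJECT,
      properties: {
        allowed: { type: SchemaType.BOOLEAN },
        reason: { type: SchemaType.STRING },
      },
    },
  },
});

const result = await moderator.generateContent(
  `Classify the following post for hate speech, harassment, sexual content,
and dangerous content. Reply with JSON: {"allowed": boolean, "reason": string}.

Post: ${userPost}`
);
const verdict = JSON.parse(result.response.text());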

This sounds like a good topic for our forums, where other folks can chime in with their experiences, since it isn't specific to the Node.js SDK: https://discuss.ai.google.dev/.