google-gemini / generative-ai-js

The official Node.js / TypeScript library for the Google Gemini API
https://www.npmjs.com/package/@google/generative-ai
Apache License 2.0
524 stars, 105 forks

ResponseSchema with enum gives 400 Bad Request Error #188

Open navarrodiego opened 1 week ago

navarrodiego commented 1 week ago

Description of the bug:

I am using the responseSchema functionality of Gemini. I can't make it work with enums.

This is how I initialize the model:

new GenerativeModel(geminiApiKey, {
  model: "gemini-1.5-pro-latest",
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: schema,
  },
});

This is a test response schema that works:

export const schema: ResponseSchema = {
  description: "Complete fields with information of the product",
  type: FunctionDeclarationSchemaType.ARRAY,
  items: {
    type: FunctionDeclarationSchemaType.OBJECT,
    properties: {
      typeOfSleeve: {
        type: FunctionDeclarationSchemaType.STRING,
        description: "",
        nullable: true
      },
    },
    required: [
     "typeOfSleeve"
    ]
  }
};

The exact same schema with an enum added fails:

export const schema: ResponseSchema = {
  description: "Complete fields with information of the product",
  type: FunctionDeclarationSchemaType.ARRAY,
  items: {
    type: FunctionDeclarationSchemaType.OBJECT,
    properties: {
      typeOfSleeve: {
        type: FunctionDeclarationSchemaType.STRING,
        description: "",
        nullable: true,
        enum: [
          "no sleeves",
          "short",
          "3/4",
          "long"
        ]
      },
    },
    required: [
     "typeOfSleeve"
    ]
  }
};

It gives me the error: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-latest:generateContent: [400 Bad Request] Request contains an invalid argument.

My prompt is:

const prompt = (productInformation: string): string => {
  return `Follow this JSON: <JSONSchema>${JSON.stringify(
    schema
  )}</JSONSchema>

Product information: ${productInformation}
`;
}

Actual vs expected behavior:

Actual: 400 Bad Request error
Expected: 200 status code and a valid response

Any other information you'd like to share?

I also think the TypeScript ResponseSchema interface has errors.

nicolab28 commented 1 week ago

I've had the same error since around Wednesday. I didn't change the code, and until Tuesday it worked perfectly.

As soon as there's an enum in a tool, I get this error: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:streamGenerateContent?alt=sse: [400 Bad Request] Request contains an invalid argument.

navarrodiego commented 1 week ago

Yeah, I also had it working last week with function declarations and tools. They made some changes that broke that. What I have learned is that if you are trying to get structured output, you have to be sure that the schema you send is exactly what Gemini expects. Gemini does not provide helpful feedback about what is failing, so you have to find it by trial and error. The TypeScript interfaces are not aligned with the model.

dmhall2 commented 1 week ago

Confirming the same issue. It previously worked with enum values and no longer does.

nicolab28 commented 1 week ago

Ditto for number: it no longer works. I use zod to create my schema, and I had to use .number().int()

delgiudices commented 6 days ago

Experiencing the same issue.

hsubox76 commented 6 days ago

I rolled back to earlier versions of the SDK and this issue still happens, so it doesn't seem to be due to changes in the SDK. I will look into whether any changes have been made to the service.

Let me know if anyone rolls back to an earlier version of the SDK and the issue goes away - that would pinpoint it as an SDK issue, which we can address here.
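One more way to separate an SDK problem from a service problem is to send the same schema straight to the REST endpoint that appears in the error message, bypassing the SDK. The sketch below only builds the request. The helper name buildGenerateContentRequest, the RestRequest shape, and the placeholder key are hypothetical, and the x-goog-api-key header and plain-string type names ("ARRAY", "OBJECT", "STRING") are assumptions about the REST API rather than anything confirmed in this thread.

```typescript
// Hypothetical request builder; not part of @google/generative-ai.
interface RestRequest {
  url: string;
  init: { method: string; headers: { [name: string]: string }; body: string };
}

function buildGenerateContentRequest(
  apiKey: string,
  model: string,
  prompt: string,
  responseSchema: object
): RestRequest {
  // Same endpoint that shows up in the 400 error message above.
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    model +
    ":generateContent";
  const body = {
    contents: [{ parts: [{ text: prompt }] }],
    generationConfig: {
      responseMimeType: "application/json",
      responseSchema: responseSchema,
    },
  };
  return {
    url: url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": apiKey, // assumed auth header
      },
      body: JSON.stringify(body),
    },
  };
}

// The failing schema from this issue, written with plain strings
// instead of the SDK's FunctionDeclarationSchemaType enum.
const restSchema = {
  type: "ARRAY",
  items: {
    type: "OBJECT",
    properties: {
      typeOfSleeve: {
        type: "STRING",
        nullable: true,
        enum: ["no sleeves", "short", "3/4", "long"],
      },
    },
    required: ["typeOfSleeve"],
  },
};

const restRequest = buildGenerateContentRequest(
  "YOUR_API_KEY", // placeholder, replace with a real key
  "gemini-1.5-pro-latest",
  "Product information: ...",
  restSchema
);
```

Passing restRequest.url and restRequest.init to fetch should show whether the service itself rejects the schema. If it does, the SDK version is probably not the cause.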

hsubox76 commented 6 days ago

Quick update: if you add the field format: "enum" it seems to get you past the invalid argument error. I tested with the example given in this issue.

async function responseSchemaShirt() {
  const schema = {
    description: "Complete fields with information of the product",
    type: FunctionDeclarationSchemaType.ARRAY,
    items: {
      type: FunctionDeclarationSchemaType.OBJECT,
      properties: {
        typeOfSleeve: {
          type: FunctionDeclarationSchemaType.STRING,
          description: "",
          nullable: true,
          // add this
          format: "enum",
          enum: [
            "no sleeves",
            "short",
            "3/4",
            "long"
          ]
        },
      },
      required: ["typeOfSleeve"],
    },
  };
  // (the rest of the function was not included in the original comment)
}

It's hard to express this requirement in the TypeScript types, but we can add a comment saying that format is required. I just need to double-check that the new strictness level is intentional.

delgiudices commented 6 days ago

Adding format: "enum" fixes the issue for me

navarrodiego commented 6 days ago

Thank you for your answers @hsubox76. It would be very useful if you update the TypeScript interfaces and types to make them match with what the Gemini models expect.

hsubox76 commented 6 days ago

It's a little complex to change the TypeScript interface so that it forces the format field to hold the correct value for each type field. If anyone is a TypeScript expert and wants to submit a PR that enforces that, here are the required format values for the different types:

// Supported formats:
// for NUMBER type: float, double
// for INTEGER type: int32, int64
// for STRING type: enum

Otherwise, what I can do is add this information to the comment above the format field (which should cause it to show up in IntelliSense tooltips and our reference docs).
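For anyone considering the PR, here is a minimal sketch of how a discriminated union could tie format to type using the values listed above. All type names here (ScalarField, EnumStringField, and so on) are hypothetical and are not the SDK's actual ResponseSchema definition. The thread does not say clearly whether format is mandatory for NUMBER and INTEGER, so the sketch leaves it optional there and requires it only when enum is present.

```typescript
// Hypothetical types for illustration only.
type NumberFormat = "float" | "double";
type IntegerFormat = "int32" | "int64";

interface BaseField {
  description?: string;
  nullable?: boolean;
}

interface NumberField extends BaseField {
  type: "NUMBER";
  format?: NumberFormat;
}

interface IntegerField extends BaseField {
  type: "INTEGER";
  format?: IntegerFormat;
}

// A plain string carries neither enum nor format.
interface PlainStringField extends BaseField {
  type: "STRING";
  enum?: undefined;
  format?: undefined;
}

// A string with enum values must also carry format: "enum".
interface EnumStringField extends BaseField {
  type: "STRING";
  format: "enum";
  enum: string[];
}

type ScalarField =
  | NumberField
  | IntegerField
  | PlainStringField
  | EnumStringField;

// Small runtime helper so the types can be exercised.
function formatOf(field: ScalarField): string {
  return field.format === undefined ? "(none)" : field.format;
}

// Accepted: enum together with format: "enum".
const sleeveField: ScalarField = {
  type: "STRING",
  nullable: true,
  format: "enum",
  enum: ["no sleeves", "short", "3/4", "long"],
};

// Rejected by the compiler: enum without format matches no union member.
// const broken: ScalarField = { type: "STRING", enum: ["short", "long"] };
```

The trade-off is that object and array schemas would need the same treatment for their nested fields, which is probably where most of the complexity mentioned above comes from.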

delgiudices commented 6 days ago

@hsubox76 why not change the implementation so that, if enum is included, it automatically passes format: "enum"?

hsubox76 commented 6 days ago

It's a little complicated and will require some thought. If we automatically populate that field, it will overwrite anything the user put there. Say the user is using an enum but left the "format" field empty or set it to something other than "enum". We would fix the problem behind the scenes, which is good, but the user would not have sent what they thought they did. This could cause confusion if, for example, the user reuses the same schema to send to the REST endpoint and gets an error.

Another idea might be that we automatically populate the "format" field but we log a warning.

Or we just throw a more informational error if the enum field is used and the format field is empty, or an incompatible string.
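To make the last two options concrete, here is a rough sketch of a helper that walks a schema and either fills in format: "enum" with a warning or throws a descriptive error. The names ensureEnumFormat, SchemaNode, and EnumFormatPolicy are hypothetical. Nothing like this exists in the SDK today, and the node shape is a simplified assumption.

```typescript
// Hypothetical helper; not part of @google/generative-ai.
interface SchemaNode {
  type: string;
  description?: string;
  nullable?: boolean;
  format?: string;
  enum?: string[];
  items?: SchemaNode;
  properties?: { [key: string]: SchemaNode };
  required?: string[];
}

type EnumFormatPolicy = "warn-and-fill" | "throw";

// Returns a copy of the schema; the caller's object is never mutated.
function ensureEnumFormat(
  node: SchemaNode,
  policy: EnumFormatPolicy,
  path: string = "$"
): SchemaNode {
  const copy: SchemaNode = { ...node };

  if (copy.enum !== undefined && copy.format !== "enum") {
    const message =
      'Schema field "' + path + '" uses enum but format is ' +
      JSON.stringify(copy.format) + '; the service expects format: "enum".';
    if (policy === "throw") {
      throw new Error(message);
    }
    console.warn(message);
    copy.format = "enum";
  }

  if (copy.items !== undefined) {
    copy.items = ensureEnumFormat(copy.items, policy, path + ".items");
  }

  if (copy.properties !== undefined) {
    const fixed: { [key: string]: SchemaNode } = {};
    for (const key of Object.keys(copy.properties)) {
      const child = copy.properties[key];
      if (child !== undefined) {
        fixed[key] = ensureEnumFormat(child, policy, path + "." + key);
      }
    }
    copy.properties = fixed;
  }

  return copy;
}

// The failing schema from this issue, with plain-string type names.
const sleeveSchema: SchemaNode = {
  type: "ARRAY",
  items: {
    type: "OBJECT",
    properties: {
      typeOfSleeve: {
        type: "STRING",
        nullable: true,
        enum: ["no sleeves", "short", "3/4", "long"],
      },
    },
    required: ["typeOfSleeve"],
  },
};

const patchedSchema = ensureEnumFormat(sleeveSchema, "warn-and-fill");
```

Returning a copy keeps the user's original object untouched, so reusing it against the REST endpoint would still fail in the same way. That partly addresses the "user didn't send what they thought they did" concern, while the warning tells them what was changed.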

typhoon93 commented 5 days ago

@hsubox76 thank you for your update here. I had the same issue using the Python package, and adding the format field as you specified fixed it.

If someone from Google is reading this: the unhelpful message from the API makes this issue really hard to debug, especially since the calls used to work just fine without this field.

Other than that, the API docs do say that enum needs to be specified as the format:

https://ai.google.dev/api/rest/v1beta/cachedContents#Schema => enum[] (string)
Optional. Possible values of the element of Type.STRING with enum format. For example we can define an Enum Direction as: {type:STRING, format:enum, enum:["EAST", "NORTH", "SOUTH", "WEST"]}

navarrodiego commented 2 days ago

I can help with updating the ResponseSchema TypeScript interface, but I need all the information. Do you have any documentation that describes in detail the rules the request must follow?