langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License
12.75k stars 2.2k forks

mimeType is always set to image/png when loading pdf with GCS uri #6990

Open deathemperor opened 1 month ago

deathemperor commented 1 month ago

Example Code

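import { ChatPromptTemplate } from "@langchain/core/prompts";
import { HumanMessage } from "@langchain/core/messages";
import { zodToJsonSchema } from "zod-to-json-schema";

// zodObject is the Zod schema describing the expected output (definition not shown).
const prompt = ChatPromptTemplate.fromMessages([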
  new HumanMessage({
    content: [
      {
        type: "image_url",
        image_url: {
          url: "gs://papaya-gemini-temp/file.pdf",
          mimeType: "application/pdf",
        },
      },
      {
        type: "text",
        text: `
Extract all 100 rows from the image. do not limit the number of rows.

You must format your output as a JSON value that adheres to a given "JSON Schema" instance.

"JSON Schema" is a declarative language that allows you to annotate and validate JSON documents.

For example, the example "JSON Schema" instance {{"properties": {{"foo": {{"description": "a list of test words", "type": "array", "items": {{"type": "string"}}}}}}, "required": ["foo"]}}}}
would match an object with one required property, "foo". The "type" property specifies "foo" must be an "array", and the "description" property semantically describes it as "a list of test words". The items within "foo" must be strings.
Thus, the object {{"foo": ["bar", "baz"]}} is a well-formatted instance of this example "JSON Schema". The object {{"properties": {{"foo": ["bar", "baz"]}}}} is not well-formatted.

Your output will be parsed and type-checked according to the provided schema instance, so make sure all fields in your output match the schema exactly and there are no trailing commas!

Here is the JSON Schema instance your output must adhere to. Include the enclosing markdown codeblock:
\`\`\`json
${JSON.stringify(zodToJsonSchema(zodObject))}
\`\`\`
    `,
      },
    ],
  }),
]);

Error Message and Stack Trace (if applicable)

call to gauth.request opts= {
  "url": "https://us-central1-aiplatform.googleapis.com/v1/projects/xxx/locations/us-central1/publishers/google/models/gemini-1.5-pro-002:generateContent",
  "method": "POST",
  "headers": {
    "User-Agent": "langchain-js/0-ChatConnection google-api-nodejs-client/8.9.0",
    "Client-Info": "0-ChatConnection",
    "x-goog-user-project": "xxx",
    "Authorization": "Bearer xxx",
    "x-goog-api-client": "gl-node/22.6.0"
  },
  "data": {
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "fileData": {
              "mimeType": "image/png",
              "fileUri": "gs://papaya-gemini-temp/file.pdf"
            }
          },
          {
            "text": "\nExtract all 100 rows from the image. do not limit the number of rows.\n\nYou must format your output as a JSON value that adheres to a given \"JSON Schema\" instance.\n\n\"JSON Schema\" is a declarative language that allows you to annotate and validate JSON documents.\n\nFor example, the example \"JSON Schema\" instance {{\"properties\": {{\"foo\": {{\"description\": \"a list of test words\", \"type\": \"array\", \"items\": {{\"type\": \"string\"}}}}}}, \"required\": [\"foo\"]}}}}\nwould match an object with one required property, \"foo\". The \"type\" property specifies \"foo\" must be an \"array\", and the \"description\" property semantically describes it as \"a list of test words\". The items within \"foo\" must be strings.\nThus, the object {{\"foo\": [\"bar\", \"baz\"]}} is a well-formatted instance of this example \"JSON Schema\". The object {{\"properties\": {{\"foo\": [\"bar\", \"baz\"]}}}} is not well-formatted.\n\nYour output will be parsed and type-checked according to the provided schema instance, so make sure all fields in your output match the schema exactly and there are no trailing commas!\n\nHere is the JSON Schema instance your output must adhere to. Include the enclosing markdown codeblock:\n```json\n{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"categoryName\":{\"type\":\"string\",\"description\":\"Row without STT and Số lượng\"},\"social_insurance_coverage\":{\"type\":\"number\",\"description\":\"extract from column Quỹ BHYT\"},\"social_insurance_copay\":{\"type\":\"number\",\"description\":\"Extract from column Người bệnh cùng chi trả\"},\"paid_by_patient\":{\"type\":\"number\",\"description\":\"Extract from column Người bệnh tự trả\"},\"children\":{\"type\":\"array\",\"items\":{\"type\":\"object\",\"properties\":{\"serviceName\":{\"type\":\"string\",\"description\":\"Extract from column Nội dung\"},\"quantity\":{\"type\":[\"number\",\"null\"],\"description\":\"Extract from column Số lượng, leave empty if not applicable\"},\"social_insurance_coverage\":{\"type\":\"number\",\"description\":\"extract from column Quỹ BHYT\"},\"social_insurance_copay\":{\"type\":\"number\",\"description\":\"Extract from column Người bệnh cùng chi trả\"},\"paid_by_patient\":{\"type\":\"number\",\"description\":\"Extract from column Người bệnh tự trả\"}},\"required\":[\"serviceName\",\"quantity\",\"social_insurance_coverage\",\"social_insurance_copay\",\"paid_by_patient\"],\"additionalProperties\":false,\"description\":\"Match all fields with data in horizontal row\"},\"description\":\"Next rows with STT and Số lượng, until Số lượng is empty\"}},\"required\":[\"categoryName\",\"social_insurance_coverage\",\"social_insurance_copay\",\"paid_by_patient\",\"children\"],\"additionalProperties\":false},\"$schema\":\"http://json-schema.org/draft-07/schema#\"}\n```\n    "
          }
        ]
      }
    ],
    "generationConfig": {
      "temperature": 0,
      "topK": 40,
      "topP": 0.8,
      "maxOutputTokens": 8192,
      "stopSequences": []
    }
  },
  "responseType": "json"
}
call to gauth.request message: Provided image is not valid.

Description

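I am trying to pass a PDF to the model by its GCS URI. Although the message sets mimeType to "application/pdf", the request sent to Vertex AI carries "mimeType": "image/png" in fileData, and the API rejects it with "Provided image is not valid."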
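
A minimal sketch of the expected behavior, not the library's actual implementation. The types and the `toFileDataPart` helper are hypothetical names that mirror the shapes in the example code and the request log above. The only assumption is that a `mimeType` supplied on `image_url` should be passed through to `fileData.mimeType`, with "image/png" used only when none is given.

```typescript
// Hypothetical type for the content part used in the example code above.
interface ImageUrlContent {
  type: "image_url";
  image_url: { url: string; mimeType?: string };
}

// Hypothetical type for the part that appears in the request log above.
interface FileDataPart {
  fileData: { mimeType: string; fileUri: string };
}

// Hypothetical helper: keep the caller's mimeType instead of always using image/png.
function toFileDataPart(content: ImageUrlContent): FileDataPart {
  const { url, mimeType } = content.image_url;
  return {
    fileData: {
      // Fall back to image/png only when the caller supplied nothing.
      mimeType: mimeType ?? "image/png",
      fileUri: url,
    },
  };
}

const part = toFileDataPart({
  type: "image_url",
  image_url: {
    url: "gs://papaya-gemini-temp/file.pdf",
    mimeType: "application/pdf",
  },
});
console.log(JSON.stringify(part, null, 2));
```

With this mapping, the `fileData` part for the PDF above would carry "application/pdf" rather than "image/png".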

System Info

"@langchain/core": "^0.3.9", "@langchain/google-vertexai": "^0.1.0",

dosubot[bot] commented 1 month ago

Hey @deathemperor! I'm here to help you with any bugs, questions, or contributions you have for LangChain.js. Let's work together to solve this issue!

The issue you're encountering seems to be related to the incorrect MIME type being set for a PDF file when using a GCS URI. The MIME type is being set to "image/png" instead of "application/pdf". This can cause problems when the file is processed, as the system expects an image rather than a PDF.

In LangChain.js, the MIME type is typically set when creating a Blob object. If the MIME type is not correctly set or overridden, it can lead to such issues. The buildSetMetadata method in the BlobStoreGoogleCloudStorageBase class is responsible for setting the MIME type in the metadata, using the mimetype property of the MediaBlob object [1].

To resolve this, you should ensure that the MIME type is correctly specified when creating the Blob or when setting the metadata. If the issue persists, it might be necessary to check the implementation details of how the MIME type is being set or overridden in your specific use case.
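
As a rough illustration of "ensuring the MIME type is correctly specified", the sketch below picks a MIME type for a GCS URI on the caller's side. It prefers an explicitly supplied value and otherwise guesses from the file extension. `guessMimeType` and its extension table are hypothetical and are not part of LangChain.js. Whether the resulting value reaches the request still depends on how the integration maps message content.

```typescript
// Hypothetical extension table; extend as needed. Not part of LangChain.js.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["pdf", "application/pdf"],
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["webp", "image/webp"],
]);

// Return the explicit MIME type if given, otherwise guess from the URI's
// file extension. Returns undefined when no guess is possible.
function guessMimeType(uri: string, explicit?: string): string | undefined {
  if (explicit) return explicit;
  // Drop any query string or fragment before looking at the file name.
  const path = uri.split(/[?#]/)[0];
  const lastSegment = path.substring(path.lastIndexOf("/") + 1);
  const dot = lastSegment.lastIndexOf(".");
  if (dot < 0) return undefined;
  const extension = lastSegment.substring(dot + 1).toLowerCase();
  return MIME_BY_EXTENSION.get(extension);
}

console.log(guessMimeType("gs://papaya-gemini-temp/file.pdf"));
```

Returning undefined for unknown extensions, instead of defaulting to an image type, makes a wrong guess visible to the caller rather than sending a mismatched MIME type to the API.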
