vercel / ai

Build AI-powered applications with React, Svelte, Vue, and Solid
https://sdk.vercel.ai/docs

Message prompt for image should support OpenAI syntax or warn user #2684

Open castortech opened 1 month ago

castortech commented 1 month ago

Feature Description

The format for specifying an image with Vercel AI is different from OpenAI's. OpenAI uses a part of type image_url with an image_url field that has a url sub-element. Vercel uses a part of type image with an image field for the URL (or content).
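For reference, the two shapes described above look roughly like this. The type names and the placeholder URL are made up for illustration and are not exports of either library; only the field names come from this issue.

```typescript
// Illustrative local types; the names are hypothetical, not library exports.
type OpenAIStyleImagePart = {
  type: "image_url";
  image_url: { url: string };
};

type AiSdkStyleImagePart = {
  type: "image";
  image: string; // URL or encoded image content, per the description above
};

// Placeholder value, used only to make the example concrete.
const placeholderUrl = "https://example.com/cat.png";

// OpenAI-style: the URL is nested under image_url.url.
const openAIStyle: OpenAIStyleImagePart = {
  type: "image_url",
  image_url: { url: placeholderUrl },
};

// AI SDK-style: the URL (or content) sits directly in the image field.
const aiSdkStyle: AiSdkStyleImagePart = {
  type: "image",
  image: placeholderUrl,
};
```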

Converting from OpenAI syntax results in:

TypeError: Cannot read properties of undefined (reading 'type')
    at eval (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:41:26)
    at Array.map (<anonymous>)
    at convertToOpenAIChatMessages (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:39:28)
    at OpenAIChatLanguageModel.getArgs (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:254:17)
    at OpenAIChatLanguageModel.doStream (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:380:37)
    at fn (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:2942:35)
    at eval (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:328:28)
    at Object.startActiveSpan (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:257:14)
    at recordSpan (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:326:17)
    at eval (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:2914:15)

This happens because, earlier in the process, the SDK dropped the image_url part that it didn't understand, and the result still passed the validate prompt method. So by the time it reached convertToOpenAIChatMessages, the content array held text as the first element and undefined in place of what was originally the image_url part.
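A minimal sketch of how that failure mode can arise, assuming only what is described above: an array whose second element has become undefined is mapped over, and each element's type is read. The function name readPartTypes is hypothetical and is not the SDK's code.

```typescript
// Hypothetical stand-in for a mapping step that reads `type` from each part.
type Part = { type: string };

function readPartTypes(parts: Array<Part | undefined>): string[] {
  // The cast mirrors code that assumes every element is defined.
  return parts.map((part) => (part as Part).type);
}

// Content array after the unrecognized part has been dropped to undefined.
const contentAfterDrop: Array<Part | undefined> = [{ type: "text" }, undefined];

let caught: unknown;
try {
  readPartTypes(contentAfterDrop);
} catch (error) {
  // Reading a property of undefined throws a TypeError at runtime.
  caught = error;
}
```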

To me this is probably a bug. In fairness, even though I lost 6 hours before I got into the guts of your code, the documentation is clear about what to do; it just gives no warning that this aspect differs from OpenAI.

So a warning here, or at least rejecting the input earlier, would make this much easier to deal with. Maybe also add a small hint to the documentation.
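As a rough sketch of what such an early check could look like: the helper below is hypothetical and not part of the AI SDK. The set of known part types is an assumption limited to the two types mentioned in this issue (text and image).

```typescript
// Hypothetical helper, not an AI SDK function.
// Assumption: only the part types named in this issue are treated as known.
const KNOWN_PART_TYPES = new Set<string>(["text", "image"]);

type LooseMessage = { role: string; content: unknown };

// Returns one description per content part whose type is not recognized.
function findUnsupportedParts(messages: LooseMessage[]): string[] {
  const problems: string[] = [];
  messages.forEach((message, messageIndex) => {
    const content = message.content;
    if (!Array.isArray(content)) return; // plain string content has no parts
    content.forEach((part: unknown, partIndex: number) => {
      const type =
        typeof part === "object" && part !== null
          ? (part as { type?: unknown }).type
          : undefined;
      if (typeof type !== "string" || !KNOWN_PART_TYPES.has(type)) {
        problems.push(
          `messages[${messageIndex}].content[${partIndex}]: unsupported part type ${String(type)}`
        );
      }
    });
  });
  return problems;
}

// Example input mirroring the OpenAI-style message; the URL is a placeholder.
const exampleProblems = findUnsupportedParts([
  { role: "system", content: "You are a friendly, helpful AI assistant." },
  {
    role: "user",
    content: [
      { type: "text", text: "explain image" },
      { type: "image_url", image_url: { url: "https://example.com/cat.png" } },
    ],
  },
]);
```

A caller could log these descriptions as warnings or throw before the request is built, so the problem surfaces at the prompt boundary instead of inside the provider conversion.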

Use Case

See description

Additional context

No response

lgrammel commented 1 month ago

Are you using typescript or javascript? Can you provide an example of the input that resulted in this error?

castortech commented 1 month ago

I am using TS.

Here is a message causing the issue:

[
  {
    role: "system",
    content: "Today is 8/16/2024.\n\nUser Instructions:\nYou are a friendly, helpful AI assistant.",
  },
  {
    role: "user",
    content: [
      {
        type: "text",
        text: "explain image",
      },
      {
        type: "image_url",
        image_url: {
          url: "...=",
        },
      },
    ],
  },
]

and the resulting error:

[TypeError: Cannot read properties of undefined (reading 'type')]
Error in chat/openai: TypeError: Cannot read properties of undefined (reading 'type')
    at eval (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:41:26)
    at Array.map (<anonymous>)
    at convertToOpenAIChatMessages (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:39:28)
    at OpenAIChatLanguageModel.getArgs (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:254:17)
    at OpenAIChatLanguageModel.doStream (webpack-internal:///(rsc)/./node_modules/@ai-sdk/openai/dist/index.mjs:380:37)
    at fn (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:2942:35)
    at eval (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:328:28)
    at Object.startActiveSpan (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:257:14)
    at recordSpan (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:326:17)
    at eval (webpack-internal:///(rsc)/./node_modules/ai/dist/index.mjs:2914:15)

lgrammel commented 1 month ago

@castortech I'm getting a TS error when I try to do this:

[Screenshot: CleanShot 2024-08-16 at 12 28 25]

Did this error show up for you? Or was it a situation where any was used?

(We might need to improve our internal checks regardless of typing, because of any.)

castortech commented 1 month ago

I am not forming the messages manually, and the type is a bit closer to any.

Here is the type info:

let finalMessages: {
    role: string;
    content: string | ({
        type: string;
        image_url: {
            url: string;
        };
        image?: undefined;
    } | {
        type: string;
        image: string;
        image_url?: undefined;
    } | {
        type: string;
        text: string;
    })[];
}[]

That type was made to be multi-provider, though mostly OpenAI, and we are working to convert fully to Vercel AI (previously we used the legacy provider only to stream the response).
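For anyone doing the same migration, a minimal conversion sketch follows. It assumes the parts have already been narrowed to the two OpenAI-style shapes shown in this thread; the type and function names are hypothetical, and whether the resulting image string is accepted as-is may depend on the SDK version.

```typescript
// Hypothetical local types; neither is a library export.
type OpenAIStylePart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

type AiSdkStylePart =
  | { type: "text"; text: string }
  | { type: "image"; image: string };

// Rewrites image_url parts into image parts and leaves text parts unchanged.
function toAiSdkParts(parts: OpenAIStylePart[]): AiSdkStylePart[] {
  return parts.map((part): AiSdkStylePart =>
    part.type === "image_url"
      ? { type: "image", image: part.image_url.url }
      : part
  );
}

// Placeholder URL, used only for the example.
const convertedParts = toAiSdkParts([
  { type: "text", text: "explain image" },
  { type: "image_url", image_url: { url: "https://example.com/cat.png" } },
]);
```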

lgrammel commented 4 weeks ago

PR: https://github.com/vercel/ai/pull/2734