vercel / ai

Build AI-powered applications with React, Svelte, Vue, and Solid
https://sdk.vercel.ai/docs

Add Structured Outputs (OpenAI) #2573

Closed arnau01 closed 2 months ago

arnau01 commented 2 months ago

Feature Description

Leverage OpenAI's new update introducing structured outputs in the API.

https://openai.com/index/introducing-structured-outputs-in-the-api/

Use Case

No response

Additional context

No response

BrianHung commented 2 months ago

https://github.com/vercel/ai/blob/0a2702eca9ec3cb6c690248abe51ff548b6afbc5/packages/openai/src/openai-chat-language-model.ts#L618

https://github.com/vercel/ai/blob/0a2702eca9ec3cb6c690248abe51ff548b6afbc5/packages/ai/core/prompt/prepare-tools-and-tool-choice.ts#L10

would be a good starting point, as strict is not copied over.
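
For context, OpenAI expects the strict flag inside the function definition itself; a rough sketch of the target Chat Completions payload shape (values here are illustrative, not taken from the SDK source):

// Rough sketch of the payload shape OpenAI expects for strict function calling:
const body = {
  model: 'gpt-4o-2024-08-06',
  messages: [{ role: 'user', content: 'What is the weather in Berlin?' }],
  tools: [
    {
      type: 'function',
      function: {
        name: 'get_weather',
        strict: true, // the flag that is currently not copied over
        parameters: {
          type: 'object',
          properties: { city: { type: 'string' } },
          required: ['city'],
          additionalProperties: false,
        },
      },
    },
  ],
};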

lgrammel commented 2 months ago

WIP: https://github.com/vercel/ai/pull/2582

lgrammel commented 2 months ago

Available in ai@3.3.3 and @ai-sdk/openai@0.0.43 with the structuredOutputs option for OpenAI chat models:

import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const { object } = await generateObject({
  model: openai('gpt-4o-2024-08-06', {
    structuredOutputs: true,
  }),
  schema: z.object({
    recipe: z.object({
      name: z.string(),
      ingredients: z.array(
        z.object({
          name: z.string(),
          amount: z.string(),
        }),
      ),
      steps: z.array(z.string()),
    }),
  }),
  prompt: 'Generate a lasagna recipe.',
});
brc-dd commented 2 months ago

Just a heads up for others: if you're getting errors like this with structuredOutputs, replace .optional() or .nullish() in your schema with .nullable().

AI_APICallError: Invalid schema for response_format 'response': In context=(), 'required' is required to be supplied and to be an array including every key in properties. Missing 'foo'
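
A minimal sketch of the fix (foo is just an illustrative field name):

import { z } from 'zod';

// Triggers the error above with structuredOutputs, because OpenAI's strict
// mode requires every property to be listed in `required`:
const before = z.object({
  foo: z.string().optional(), // or .nullish()
});

// Making the field nullable keeps it in `required` and satisfies strict mode:
const after = z.object({
  foo: z.string().nullable(),
});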
bernaferrari commented 2 months ago

Shouldn't Vercel AI fix this? The Zod-to-JSON-Schema conversion could handle this automatically.

BrianHung commented 2 months ago

Official documentation on this: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas

lgrammel commented 2 months ago

@brc-dd @bernaferrari

This is a limitation of the OpenAI implementation; other providers will not have it. Therefore I don't want to put in any magic here; imo it's on the user / OpenAI to solve this.

brc-dd commented 2 months ago

Yeah. I also think this SDK's behavior is fine. Maybe a note in the structuredOutputs section suggesting that some schemas might not work, plus a link to the OpenAI docs, would be sufficient.

crivano commented 2 months ago

Will structuredOutputs be implemented for the "streamText" method? I have an application that streams a long JSON to the client and it would be great if I could use the structuredOutputs constraint.

lgrammel commented 2 months ago

@crivano structuredOutputs is a flag on OpenAI models in the AI SDK. You can use streamObject if you want to stream structured JSON.
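
For example, a minimal streamObject sketch with the structuredOutputs flag (schema and prompt are illustrative):

import { streamObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const { partialObjectStream } = await streamObject({
  model: openai('gpt-4o-2024-08-06', { structuredOutputs: true }),
  schema: z.object({
    recipe: z.object({
      name: z.string(),
      steps: z.array(z.string()),
    }),
  }),
  prompt: 'Generate a lasagna recipe.',
});

// Partial objects arrive while the JSON is still streaming, so the client
// never has to parse an incomplete JSON string itself.
for await (const partialObject of partialObjectStream) {
  console.log(partialObject);
}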

crivano commented 2 months ago

@lgrammel , it worked perfectly! Thank you!

pommedeterresautee commented 2 months ago

@lgrammel is it possible to have structured output enabled with streamText (i.e. when a tool is called by OpenAI's latest GPT model)? So far it doesn't work in our code...

const hasTools = Object.keys(tools).length > 0;
const modelOptions = {
  structuredOutputs: true,
  ...(hasTools && { parallelToolCalls: false }),
};

const { textStream, text, toolCalls } = await streamText({
  model: openai(model, modelOptions),
  toolChoice:
    toolsChoice === null || !hasTools // when tools are empty, tool_choice must be undefined
      ? undefined
      : toolsChoice,
  messages,
  tools,
  ...
lgrammel commented 2 months ago

@pommedeterresautee It should work without any extra effort; just enable structured outputs on the model. Are you on the latest version of ai and @ai-sdk/openai? If so, what are you observing that makes you think structured output for tool calling does not work?
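
For reference, a minimal sketch of tool calling with structuredOutputs enabled (the tool name and schema are illustrative, not taken from your code):

import { streamText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const { textStream } = await streamText({
  // structuredOutputs: true makes tool arguments follow the schema strictly
  model: openai('gpt-4o-2024-08-06', { structuredOutputs: true }),
  tools: {
    classifyTask: tool({
      description: 'Classify the kind of task the user is asking for.',
      parameters: z.object({
        task: z.enum(['QUESTION', 'DRAFT', 'REVIEW', 'OTHER']),
      }),
      execute: async ({ task }) => ({ task }),
    }),
  },
  prompt: 'Please review my draft contract.',
});

for await (const chunk of textStream) {
  process.stdout.write(chunk);
}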

pommedeterresautee commented 2 months ago

Yes latest version.

We got a crash from Zod telling us an enum was not respected (a value not in the list was returned):

    cause: ZodError: [
      {
        "received": "ANALYSIS",
        "code": "invalid_enum_value",
        "options": [
          "QUESTION",
          "DRAFT",
          "REVIEW",
          "OTHER"
        ],
        "path": [
          "task"
        ],
        "message": "Invalid enum value. Expected 'QUESTION' | 'DRAFT' | 'REVIEW' | 'OTHER', received 'ANALYSIS'"
      }
    ]

  2181 |   });
  2182 |   if (parseResult.success === false) {
> 2183 |     throw new InvalidToolArgumentsError({
       | ^
  2184 |       toolName,
  2185 |       toolArgs: toolCall.args,
  2186 |       cause: parseResult.error
lgrammel commented 2 months ago

@pommedeterresautee enums are difficult for LLMs. The mapping means they'll produce strings. Have you tried adding a description? See also https://sdk.vercel.ai/docs/ai-sdk-core/tools-and-tool-calling#prompt-engineering-with-tools
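
For example, a description on the enum might look like this (a sketch; the wording is illustrative):

import { z } from 'zod';

const parameters = z.object({
  task: z
    .enum(['QUESTION', 'DRAFT', 'REVIEW', 'OTHER'])
    .describe(
      'The task category. Must be exactly one of the listed values; use OTHER for anything that does not fit.',
    ),
});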

pommedeterresautee commented 2 months ago

We have a description. If I am not wrong, the schema is supposed to contain the authorized values:

https://openai.com/index/introducing-structured-outputs-in-the-api/

[screenshot from the linked OpenAI announcement]

What do you mean by your remark?

My understanding is that OpenAI uses constrained generation in the August version of GPT-4o when structured outputs are enabled, so it should not be able to choose a token that doesn't match the authorized values.

bernaferrari commented 2 months ago

You should share the code you are running.

pommedeterresautee commented 2 months ago

Thanks @bernaferrari, and done :-) @lgrammel -> repro code here: https://github.com/vercel/ai/issues/2683

mattkauffman23 commented 2 months ago

How does streaming work with structured output? Won't the JSON response be invalid until the stream is complete? I have a case where I have a response message along with some related metadata extraction that uses structured output, and I'd like to stream the message portion.

crivano commented 2 months ago

This question is better suited to Stack Overflow than to this GitHub issue, but you can use a partial JSON parser like this one: https://github.com/promplate/partial-json-parser-js

bernaferrari commented 2 months ago

This library does that automatically using streamObject AFAIK. No need for anything else.

gutembergAtJelou commented 1 month ago

Why does generateObject not have an option for tool calling? Let's say I have 5 tools that retrieve different values, and then I want to use a model that takes that information and returns a structured output.
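
One possible workaround until such an option exists is a two-step flow: let generateText run the tool calls, then pass the gathered information to generateObject. A rough sketch; the tool, schema, and prompts are illustrative, and maxToolRoundtrips is assumed to be the multi-step option in the ai 3.x releases discussed in this thread:

import { generateText, generateObject, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Step 1: let the model call the tools and gather the values.
const research = await generateText({
  model: openai('gpt-4o-2024-08-06'),
  tools: {
    getPrice: tool({
      description: 'Get the current price of a product.',
      parameters: z.object({ product: z.string() }),
      execute: async ({ product }) => ({ product, price: 42 }), // stub implementation
    }),
  },
  maxToolRoundtrips: 5, // feed tool results back to the model (option name assumed for ai 3.x)
  prompt: 'Look up the price of the standard plan.',
});

// Step 2: turn the gathered information into a structured object.
const { object } = await generateObject({
  model: openai('gpt-4o-2024-08-06', { structuredOutputs: true }),
  schema: z.object({ product: z.string(), price: z.number() }),
  prompt: `Extract the product and its price from: ${research.text}`,
});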