Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value.
final prompt = PromptValue.chat([
  ChatMessage.system(
    'Extract the data of any companies mentioned in the '
    'following statement. Return a JSON list.',
  ),
  ChatMessage.humanText(
    'Google was founded in the USA, while Deepmind was founded in the UK',
  ),
]);
final chatModel = ChatOpenAI(
  apiKey: openaiApiKey,
  defaultOptions: ChatOpenAIOptions(
    model: 'gpt-4o',
    temperature: 0,
    responseFormat: ChatOpenAIResponseFormat.jsonSchema(
      ChatOpenAIJsonSchema(
        name: 'Companies',
        description: 'A list of companies',
        strict: true,
        schema: {
          'type': 'object',
          'properties': {
            'companies': {
              'type': 'array',
              'items': {
                'type': 'object',
                'properties': {
                  'name': {'type': 'string'},
                  'origin': {'type': 'string'},
                },
                'additionalProperties': false,
                'required': ['name', 'origin'],
              },
            },
          },
          'additionalProperties': false,
          'required': ['companies'],
        },
      ),
    ),
  ),
);
final res = await chatModel.invoke(prompt);
// {
// "companies": [
// {
// "name": "Google",
// "origin": "USA"
// },
// {
// "name": "Deepmind",
// "origin": "UK"
// }
// ]
// }
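Because strict mode guarantees the response conforms to the schema, the model's output can be decoded with `jsonDecode` and its keys accessed directly, without defensive checks for missing fields. The sketch below is a minimal, self-contained illustration of that parsing step; the `raw` string is a hypothetical model response shaped like the output above, not data returned by an actual API call.

```dart
import 'dart:convert';

void main() {
  // Hypothetical model output; with strict: true it is guaranteed
  // to match the supplied JSON Schema.
  const raw = '{"companies":['
      '{"name":"Google","origin":"USA"},'
      '{"name":"Deepmind","origin":"UK"}]}';

  final decoded = jsonDecode(raw) as Map<String, dynamic>;
  // 'companies' is a required key, so it is safe to read directly.
  final companies = decoded['companies'] as List;
  for (final company in companies) {
    print('${company['name']} (${company['origin']})');
  }
}
```

In a real application the same decoding would be applied to the string content of the model's response message.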
When you use strict: true, the model's outputs will match the supplied schema exactly. Note that strict mode supports only a subset of JSON Schema for performance reasons. Under the hood, OpenAI uses a technique known as constrained sampling (or constrained decoding): for each JSON Schema, they compute a grammar that represents that schema and pre-process its components so they are easily accessible during model sampling. This is why the first request with a new schema incurs a latency penalty. Typical schemas take under 10 seconds to process on the first request, but more complex schemas may take up to a minute.