Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value.
final prompt = PromptValue.chat([
  ChatMessage.system(
    'Extract the data of any companies mentioned in the '
    'following statement. Return a JSON list.',
  ),
  ChatMessage.humanText(
    'Google was founded in the USA, while Deepmind was founded in the UK',
  ),
]);
final chatModel = ChatOpenAI(
  apiKey: openaiApiKey,
  defaultOptions: ChatOpenAIOptions(
    model: 'gpt-4o',
    temperature: 0,
    responseFormat: ChatOpenAIResponseFormat.jsonSchema(
      ChatOpenAIJsonSchema(
        name: 'Companies',
        description: 'A list of companies',
        strict: true,
        schema: {
          'type': 'object',
          'properties': {
            'companies': {
              'type': 'array',
              'items': {
                'type': 'object',
                'properties': {
                  'name': {'type': 'string'},
                  'origin': {'type': 'string'},
                },
                'additionalProperties': false,
                'required': ['name', 'origin'],
              },
            },
          },
          'additionalProperties': false,
          'required': ['companies'],
        },
      ),
    ),
  ),
);
final res = await chatModel.invoke(prompt);
// {
// "companies": [
// {
// "name": "Google",
// "origin": "USA"
// },
// {
// "name": "Deepmind",
// "origin": "UK"
// }
// ]
// }
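Because strict mode guarantees the response conforms to the schema, the model's output can be decoded with `jsonDecode` and its keys accessed directly, without defensive checks for missing fields. The sketch below is a minimal, self-contained illustration of that parsing step; the `raw` string is a hypothetical model response shaped like the output above, not data returned by an actual API call.

```dart
import 'dart:convert';

void main() {
  // Hypothetical model output; with strict: true it is guaranteed
  // to match the supplied JSON Schema.
  const raw = '{"companies":['
      '{"name":"Google","origin":"USA"},'
      '{"name":"Deepmind","origin":"UK"}]}';

  final decoded = jsonDecode(raw) as Map<String, dynamic>;
  // 'companies' is a required key, so it is safe to read directly.
  final companies = decoded['companies'] as List;
  for (final company in companies) {
    print('${company['name']} (${company['origin']})');
  }
}
```

In a real application the same decoding would be applied to the string content of the model's response message.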
When you use strict: true, the model's outputs will match the supplied schema exactly. Note that strict mode supports only a subset of JSON Schema for performance reasons. Under the hood, OpenAI uses a technique known as constrained sampling (or constrained decoding): for each JSON Schema, they compute a grammar that represents that schema and pre-process its components so they are easily accessible during model sampling. This is why the first request with a new schema incurs a latency penalty. Typical schemas take under 10 seconds to process on the first request, but more complex schemas may take up to a minute.