I am using go-openai to call the Braintrust AI proxy, which provides access to models from OpenAI, Anthropic, Google, AWS, Mistral, and third-party inference providers through a single, unified (OpenAI-compatible) API.
When requesting Anthropic models with prompt caching, I need to send this kind of request:
```shell
curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: prompt-caching-2024-07-31" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "system": [
      {
        "type": "text",
        "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n"
      },
      {
        "type": "text",
        "text": "<the entire contents of Pride and Prejudice>",
        "cache_control": {"type": "ephemeral"}
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "Analyze the major themes in Pride and Prejudice."
      }
    ]
  }'
```
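For context, the closest I can currently get through go-openai looks roughly like the sketch below (the proxy base URL and API key are placeholders for my Braintrust setup); there is nowhere to attach `cache_control` to the system message that carries the large document:

```go
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	cfg := openai.DefaultConfig("<braintrust-api-key>")
	cfg.BaseURL = "https://<my-braintrust-proxy>/v1" // placeholder proxy endpoint
	client := openai.NewClientWithConfig(cfg)

	resp, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
		Model:     "claude-3-5-sonnet-20241022",
		MaxTokens: 1024,
		Messages: []openai.ChatCompletionMessage{
			{
				Role:    openai.ChatMessageRoleSystem,
				Content: "You are an AI assistant tasked with analyzing literary works.",
			},
			{
				// This is the message I would want to mark with
				// cache_control, but there is no field for it today.
				Role:    openai.ChatMessageRoleSystem,
				Content: "<the entire contents of Pride and Prejudice>",
			},
			{
				Role:    openai.ChatMessageRoleUser,
				Content: "Analyze the major themes in Pride and Prejudice.",
			},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```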
Would you consider adding a `CacheControl` field to `ChatCompletionMessage` for this kind of use case, even though it is not part of the OpenAI API per se?
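For illustration only, something along these lines would cover my use case; the `CacheControl` type, field name, and JSON tag below are just a suggestion on my side, not existing go-openai API:

```go
// Hypothetical sketch, not existing go-openai types.
type CacheControl struct {
	Type string `json:"type"` // e.g. "ephemeral"
}

type ChatCompletionMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
	// ... existing fields ...

	// Proposed addition: omitted from the JSON when nil, so plain
	// OpenAI requests are unchanged, while proxies such as Braintrust
	// could forward it to Anthropic's prompt caching.
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}
```

which would let callers write something like:

```go
msg := openai.ChatCompletionMessage{
	Role:         openai.ChatMessageRoleSystem,
	Content:      "<the entire contents of Pride and Prejudice>",
	CacheControl: &openai.CacheControl{Type: "ephemeral"}, // hypothetical field
}
```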