anthropics / anthropic-sdk-python

MIT License

Batch API does not support cache_control #689

Open BrandonStudio opened 1 week ago

BrandonStudio commented 1 week ago

Batch API failed with error message

messages.5.content.0.text.cache_control: Extra inputs are not permitted

I have specified betas both in params and in the call to client.beta.messages.batches.create.

My requests look like this:

{
    "custom_id": "...",
    "params": {
        "model": "claude-3-5-sonnet-20240620",
        "system": [
            {
                "type": "text",
                "text": "...",
                "cache_control": {
                    "type": "ephemeral"
                }
            }
        ],
        "messages": [
            {
                "role": "user",
                "content": "..."
            },
            {
                "role": "assistant",
                "content": "..."
            },
            {
                "role": "user",
                "content": "..."
            },
            {
                "role": "assistant",
                "content": "..."
            },
            {
                "role": "user",
                "content": "..."
            },
            {
                "role": "assistant",
                "content": [
                    {
                        "type": "text",
                        "text": "...",
                        "cache_control": {
                            "type": "ephemeral"
                        }
                    }
                ]
            },
            {
                "role": "user",
                "content": "..."
            },
            {
                "role": "assistant",
                "content": "..."
            }
        ],
        "max_tokens": 4096,
        "temperature": 0,
        "top_p": 1,
        "betas": [
            "prompt-caching-2024-07-31"
        ]
    }
}

Rallio67 commented 1 week ago

I could not get this to work and spent all day trying things and searching online with no luck. Eventually, I figured out that you can make it work by calling the HTTP API directly, either with curl or with requests in Python. The example below, which I have verified works, uses the Batch API together with prompt caching.

import requests
import os

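# Read the key from the environment; the placeholder is only a fallback.
API_KEY = os.environ.get("ANTHROPIC_API_KEY", "apikey-goes-here")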

url = "https://api.anthropic.com/v1/messages/batches"

headers = {
    "x-api-key": API_KEY,
    "anthropic-version": "2023-06-01",
    "anthropic-beta": "prompt-caching-2024-07-31, message-batches-2024-09-24",
    "content-type": "application/json"
}

payload = {'requests': [{'custom_id': '74a62a763d4',
   'params': {'model': 'claude-3-5-sonnet-20240620',
    'max_tokens': 100,
    'system': [{'type': 'text',
      'text': '<your cache text goes here>',
      'cache_control': {'type': 'ephemeral'}}],
    'messages': [{'role': 'user',
      'content': [{'type': 'text',
        'text': "your non-cache text goes here"}]}]}},
  {'custom_id': '32e97f21a76',
   'params': {'model': 'claude-3-5-sonnet-20240620',
    'max_tokens': 100,
    'system': [{'type': 'text',
      'text': '<your cache text goes here>',
      'cache_control': {'type': 'ephemeral'}}],
    'messages': [{'role': 'user',
      'content': [{'type': 'text',
        'text': "your non-cache text goes here"}]}]}}]}

response = requests.post(url, headers=headers, json=payload)

print(response.status_code)
print(response.json())

It would be nice if Anthropic provided some examples with the Python SDK that clearly show how to combine the beta features, since doing so unlocks a lot of cost savings for end users!

aaron-lerner commented 1 week ago

Thanks for reporting! There are two things going on here:

  1. betas maps to the anthropic-beta header and should be specified at the top level. So use client.beta.messages.batches.create(betas=[...], requests=[...]) rather than nesting betas within requests. The current types are misleading and will be fixed.
  2. There was a bug with specifying betas at the top-level. This was fixed in v0.36.1 from earlier today. Update to that and give it a shot and let us know if you run into any further trouble.
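
A minimal sketch of point 1, assuming the requests were built with betas nested inside params as in the original report. hoist_betas is a hypothetical helper written for illustration, not part of the SDK; it strips the nested key and collects the values so they can be passed once at the top level.

```python
from copy import deepcopy


def hoist_betas(requests):
    """Strip per-request "betas" and return (cleaned_requests, top_level_betas).

    Hypothetical helper for illustration; it is not part of the anthropic SDK.
    """
    cleaned = deepcopy(requests)  # leave the caller's list untouched
    betas = []
    for request in cleaned:
        for beta in request["params"].pop("betas", []):
            if beta not in betas:  # de-duplicate while preserving order
                betas.append(beta)
    return cleaned, betas


# Same shape as the request in the original report, shortened.
nested = [
    {
        "custom_id": "example-1",
        "params": {
            "model": "claude-3-5-sonnet-20240620",
            "max_tokens": 4096,
            "messages": [{"role": "user", "content": "..."}],
            "betas": ["prompt-caching-2024-07-31"],
        },
    }
]

requests_for_sdk, betas = hoist_betas(nested)
print(betas)  # ['prompt-caching-2024-07-31']
```

The two return values would then be passed as client.beta.messages.batches.create(betas=betas, requests=requests_for_sdk), which requires v0.36.1 or later per point 2.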

BrandonStudio commented 1 week ago

I did try specifying betas only at the top level at the very beginning, but it did not work. From what you say, this was because of a bug in the Python API that has been fixed in 0.36.1. Am I right?

Unfortunately, I can't try it at the moment. I would appreciate it if someone else could test it.

BrandonStudio commented 1 week ago

BTW, I do think there are a lot of problems with type checking: request messages typed as PromptCachingBetaMessageParam cannot be passed as BetaMessageParam. I hope you will consider refactoring.

aaron-lerner commented 1 week ago

Yes, you'll need to update to 0.36.1 to do client.beta.messages.batches.create(betas=[...], requests=[...]).
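
A rough sketch of what that call might look like with the Python SDK, assuming anthropic 0.36.1 or later is installed and ANTHROPIC_API_KEY is set. build_requests, submit_batch, and the custom_id scheme are hypothetical names for illustration. Only the prompt-caching beta is listed here; the raw HTTP example earlier in this thread also sends message-batches-2024-09-24, and whether the SDK needs it passed explicitly is not confirmed in this thread.

```python
def build_requests(cached_system_text, user_prompts):
    """Build batch request dicts that share one cacheable system prompt."""
    return [
        {
            "custom_id": f"request-{index}",  # hypothetical ID scheme
            "params": {
                "model": "claude-3-5-sonnet-20240620",
                "max_tokens": 100,
                "system": [
                    {
                        "type": "text",
                        "text": cached_system_text,
                        "cache_control": {"type": "ephemeral"},
                    }
                ],
                # Note: no "betas" key inside params.
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for index, prompt in enumerate(user_prompts)
    ]


def submit_batch(requests):
    """Submit the batch with betas at the top level, as described above."""
    import anthropic  # imported lazily so build_requests works without the SDK

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    return client.beta.messages.batches.create(
        betas=["prompt-caching-2024-07-31"],
        requests=requests,
    )
```

Calling submit_batch(build_requests("<your cache text goes here>", ["first prompt", "second prompt"])) would send one batch with two requests that share the cached system prompt.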

Ah, PromptCachingBeta* types are in a bit of a weird state. You can think of everything at client.beta.prompt_caching.messages as being in its own world, separate from client.beta.messages. So PromptCachingBetaMessageParam is not meant to be used with BetaMessageParam. In a future version of the SDK we'll be removing client.beta.prompt_caching.messages and its related types, and this will be simplified.

ankpatr1 commented 8 hours ago

Hi, you should remove the "cache_control" field from the content. Once you have removed it, try again; hopefully this eliminates the error.