Description

When using the experimental_providerMetadata property to cache prompts with the Anthropic provider, the reported token usage is inaccurate: the reported prompt tokens count only the non-cached part of the prompt, even though the cached section still incurs a (smaller) cost. Anthropic bills cache reads at a reduced rate (roughly 10% of the base input-token price), so the cached tokens still contribute to the total cost but are invisible in the reported usage.
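For context, Anthropic's Messages API reports cached tokens in separate usage counters rather than in input_tokens (which is presumably what the SDK's promptTokens maps to). An illustrative sketch of that usage shape, with field names from Anthropic's prompt-caching documentation and made-up values:

// Usage block of a raw Anthropic Messages API response with prompt caching
// active. Values are illustrative.
const usage = {
  input_tokens: 42, // only the non-cached part of the prompt
  cache_creation_input_tokens: 1873, // tokens written to the cache (billed at ~1.25x the base input rate)
  cache_read_input_tokens: 0, // tokens read from the cache (billed at ~0.1x the base input rate)
  output_tokens: 128,
};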
Code example
import { createAnthropic } from "@ai-sdk/anthropic";
import { generateObject } from "ai";

const anthropicProvider = createAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const model = anthropicProvider.languageModel("...", { cacheControl: true });

await generateObject({
  model,
  schema: ..., // elided in the original report
  messages: [
    {
      role: "system",
      content: ...,
    },
    {
      role: "user",
      content: ...,
      experimental_providerMetadata: {
        anthropic: { cacheControl: { type: "ephemeral" } },
      },
    },
    // Prompt tokens above here are not reported in the usage
    {
      role: "user",
      content: ...,
    },
  ],
});
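The missing counts should be recoverable from the provider metadata on the result. A minimal sketch adapting the call above to capture the result, assuming the provider surfaces Anthropic's raw cache_creation_input_tokens / cache_read_input_tokens counters as cacheCreationInputTokens / cacheReadInputTokens on experimental_providerMetadata (the exact keys may differ by SDK version):

const result = await generateObject({
  model,
  schema: ..., // same elided schema as above
  messages: [/* same messages as above */],
});

// usage.promptTokens only covers the non-cached portion of the prompt.
console.log(result.usage.promptTokens);

// Assumed metadata keys mirroring Anthropic's raw usage counters:
const meta = result.experimental_providerMetadata?.anthropic;
const cacheWriteTokens = Number(meta?.cacheCreationInputTokens ?? 0); // billed at ~1.25x the base input rate
const cacheReadTokens = Number(meta?.cacheReadInputTokens ?? 0); // billed at ~0.1x the base input rate

// Effective input cost expressed in base-rate token units:
const effectiveInputTokens =
  result.usage.promptTokens + 1.25 * cacheWriteTokens + 0.1 * cacheReadTokens;
console.log({ cacheWriteTokens, cacheReadTokens, effectiveInputTokens });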
Additional context
No response