Description
For a chatbot that needs to go live in the next 24 hours, we often get the following error in the middle of a conversation: `{"type":"error","error":{"details":null,"type":"overloaded_error","message":"Overloaded"}}`.
Using `"@anthropic-ai/sdk": "^0.27.3"`
Our usual input throughput is around 5k-30k tokens per minute. We implemented a retry solution but still get the error, and we also sometimes see long delays. We make 2-3 API calls per conversation and expect 3-4 simultaneous conversations at peak. We would really appreciate urgent feedback. Also, can this be resolved by implementing prompt caching, or any other techniques?
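For reference, a minimal retry sketch in TypeScript, assuming the standard `@anthropic-ai/sdk` Messages API: `maxRetries` and `Anthropic.APIError` are real SDK features, while `createWithBackoff` and its parameters are hypothetical names for an extra application-level backoff layer on top of the SDK's built-in retries:

```ts
import Anthropic from '@anthropic-ai/sdk';

// The SDK already retries connection errors, 408/409/429 and >=500 responses
// (which includes 529 "Overloaded") with a short exponential backoff; the
// default maxRetries is 2, so raising it absorbs more transient failures.
const client = new Anthropic({ maxRetries: 5 });

// Hypothetical wrapper adding jittered exponential backoff for the cases
// where all SDK-level retries are exhausted and the call still fails.
async function createWithBackoff(
  params: Anthropic.MessageCreateParamsNonStreaming,
  attempts = 4,
): Promise<Anthropic.Message> {
  for (let i = 0; ; i++) {
    try {
      return await client.messages.create(params);
    } catch (err) {
      const status = err instanceof Anthropic.APIError ? err.status : undefined;
      // 529 = overloaded_error, 429 = rate limit; anything else is rethrown.
      if ((status !== 529 && status !== 429) || i >= attempts - 1) throw err;
      // Exponential delay capped at 30s, with jitter to avoid retry bursts.
      const delay = Math.min(30_000, 1000 * 2 ** i) * (0.5 + Math.random());
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```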
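On the prompt-caching question: caching a long, stable system prompt cuts repeated input tokens and time-to-first-token, which eases rate-limit (429) pressure, but a 529 `overloaded_error` signals temporary server-side load, so retries with backoff remain the main mitigation. A minimal sketch, assuming the prompt-caching beta surface exposed by recent 0.27.x releases (`client.beta.promptCaching.messages.create`); `LONG_SYSTEM_PROMPT` and the model id are placeholder examples:

```ts
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

// Hypothetical placeholder: prompts must meet a minimum length
// (~1024 tokens on Sonnet-class models) before they are cached.
const LONG_SYSTEM_PROMPT = 'Your chatbot instructions, knowledge, examples...';

const response = await client.beta.promptCaching.messages.create({
  model: 'claude-3-5-sonnet-20240620',
  max_tokens: 1024,
  system: [
    {
      type: 'text',
      text: LONG_SYSTEM_PROMPT,
      // Marks this block as cacheable; later calls sharing the same
      // prefix read it from cache instead of re-processing it.
      cache_control: { type: 'ephemeral' },
    },
  ],
  messages: [{ role: 'user', content: 'Hello' }],
});

// usage reports cache_creation_input_tokens / cache_read_input_tokens,
// which confirm whether the cache is actually being hit.
console.log(response.usage);
```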
Code example
No response
AI provider
No response
Additional context
No response