Closed jamesbraza closed 6 days ago
hi @jamesbraza - would you pass this with extra_headers ?
Maybe I am misunderstanding your question, but I think the retry-after header is in the response from Anthropic to LiteLLM, so end users don't pass anything. Does that make sense?
oh you just want to access the response header from anthropic ?
So the retry-after response header from Anthropic tells LiteLLM how long to wait before retrying an API call. Currently, LiteLLM doesn't take that parameter into account.
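To make that concrete, here is a minimal sketch of reading the header value, assuming it follows the standard HTTP Retry-After form (either a number of seconds or an HTTP-date). The helper name parse_retry_after is hypothetical and is not part of LiteLLM.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return seconds to wait, or None if the header is missing or unparseable.

    Hypothetical helper for illustration only; not a LiteLLM API.
    """
    if value is None:
        return None
    value = value.strip()
    # Form 1: a plain number of seconds, e.g. "12".
    try:
        return max(0.0, float(value))
    except ValueError:
        pass
    # Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if retry_at.tzinfo is None:
        # Assumption: treat a date without a timezone as UTC.
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means there is no need to wait.
    return max(0.0, (retry_at - now).total_seconds())
```

Returning None for a missing or malformed header lets the caller fall back to whatever backoff it already uses.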
oh you just want to access the response header from anthropic ?
I am not looking to access the response header, the request is that LiteLLM's Anthropic code should be using their header when calculating how long to wait before a retry. Does that make sense now?
I am using litellm.acompletion with some messages, max_retries=3, and an Anthropic model.
how do you want litellm to use retry-after in the completion calls ?
It seems like what happens is:
max_retries gets considered an optional_params
anthropic_chat_completions.completion
retry-after header that tells how long clients should wait before retrying
retry-after header
What I want to happen is at step 5: LiteLLM to take into account the retry-after header and sleep that duration, before retrying.
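Roughly, the behavior I am asking for could look like the sketch below. It is not LiteLLM's code: call_with_retry_after, ResponseLike, base_delay, and the injected send and sleep callables are all hypothetical names for illustration. It assumes the response exposes status_code and a headers mapping (as an httpx.Response does) and that retry-after holds a number of seconds.

```python
import asyncio
import random
from typing import Awaitable, Callable, Mapping, Protocol


class ResponseLike(Protocol):
    """Anything with a status code and headers, e.g. an httpx.Response."""

    status_code: int
    headers: Mapping[str, str]


async def call_with_retry_after(
    send: Callable[[], Awaitable[ResponseLike]],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> ResponseLike:
    """Call send(), retrying on 429 and sleeping for the server's retry-after."""
    response = await send()
    for attempt in range(max_retries):
        if response.status_code != 429:
            break
        try:
            # Prefer the server-provided wait time (assumed to be in seconds).
            # Note: a plain dict lookup is case-sensitive, unlike httpx headers.
            delay = max(0.0, float(response.headers.get("retry-after", "")))
        except ValueError:
            # Header missing or not numeric: fall back to exponential backoff with jitter.
            delay = base_delay * 2**attempt + random.uniform(0, base_delay)
        await sleep(delay)
        response = await send()
    return response
```

The sleep function is a parameter only so the loop can be exercised without actually waiting; in real use the default asyncio.sleep would apply.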
For reference, it looks like LiteLLM's azure_dall_e_2 is actually using the retry-after header: https://github.com/BerriAI/litellm/blob/v1.41.3/litellm/llms/custom_httpx/azure_dall_e_2.py#L53. Though I am not sure a custom HTTP transport layer is necessary to respect retry-after, it seems like overkill.
The Feature
Anthropic has a retry-after header in their response when one hits a 429 Too Many Requests error: https://docs.anthropic.com/en/api/rate-limits#response-headers
It looks like litellm==1.40.25's Anthropic code directly uses httpx for POSTs: https://github.com/BerriAI/litellm/blob/v1.40.25/litellm/llms/anthropic.py#L182
And from what I have read, it looks like LiteLLM doesn't utilize the retry-after header anywhere in the normal completion call stack.
For reference, it seems LiteLLM's router entity does support retry-after: https://github.com/BerriAI/litellm/blob/v1.40.25/litellm/utils.py#L5457
Motivation, pitch
Can we support the retry-after header? It will enable retrying Anthropic calls in a way that is compliant with their API.
Twitter / LinkedIn details
No response