BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: LiteLLM returns 500 in case of Quota exceeded for anthropic-claude-3-haiku #4259

Closed (kmyczkowska-hypatos closed 4 months ago)

kmyczkowska-hypatos commented 4 months ago

What happened?

LiteLLM returns a 500 error code when the quota is exceeded for anthropic-claude-3-haiku. It should return 429 in this case, matching the status code in the inner response.

Relevant log output

InternalServerError. Cause: Error code: 500 - {'error': {'message': "Error code: 429 - {'error': {'code': 429, 'message': 'Quota exceeded for aiplatform.googleapis.com/online_prediction_tokens_per_minute_per_base_model with base model: anthropic-claude-3-haiku. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.', 'status': 'RESOURCE_EXHAUSTED'}}", 'type': None, 'param': None, 'code': 500}}

version: main-v1.37.20
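
As a client-side stopgap, the upstream status can be recovered from the wrapped message in the log above. The sketch below is only an illustration: `innermost_status_code` is a hypothetical helper, not part of LiteLLM, and it assumes the nested errors keep the "Error code: NNN" wording shown in the log.

```python
import re
from typing import Optional

# Hypothetical helper, not part of LiteLLM. Assumes each nested error
# is rendered as "Error code: NNN", as in the log output above.
_ERROR_CODE_RE = re.compile(r"Error code: (\d{3})")


def innermost_status_code(message: str) -> Optional[int]:
    """Return the last 'Error code: NNN' in the message, or None if absent.

    The outer error is printed first and the wrapped upstream error after
    it, so the last match corresponds to the innermost (upstream) error.
    """
    matches = _ERROR_CODE_RE.findall(message)
    return int(matches[-1]) if matches else None


# Shortened version of the message from the log output.
wrapped = (
    "Error code: 500 - {'error': {'message': \"Error code: 429 - "
    "{'error': {'code': 429, 'status': 'RESOURCE_EXHAUSTED'}}\", "
    "'type': None, 'param': None, 'code': 500}}"
)

status = innermost_status_code(wrapped)
if status == 429:
    # Quota exceeded upstream: treat as rate limiting, e.g. retry with backoff.
    print("upstream rate limit hit")
```

With the status recovered this way, a caller can treat the failure as rate limiting and back off, rather than handling it as a generic server error.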

ishaan-jaff commented 4 months ago

working on a fix

ishaan-jaff commented 4 months ago

Fixed here @kmyczkowska-hypatos https://github.com/BerriAI/litellm/pull/4263

ishaan-jaff commented 4 months ago

@kmyczkowska-hypatos any chance we can hop on a call? I'd love to learn how we can improve litellm for you.

Sharing a link to my cal for your convenience: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-cha