BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: Vertex AI streaming fails with weird error sometimes #4500

Closed · Manouchehri closed this 2 months ago

Manouchehri commented 3 months ago

What happened?

It seems like with #4459, we aren't handling the cases where the response isn't SSE JSON. I'm not exactly sure what's happening at the moment; my guess is rate limiting error messages?

Relevant log output

litellm.proxy.proxy_server.async_data_generator(): Exception occured - litellm.APIConnectionError: Error parsing chunk: Expecting property name enclosed in double quotes: line 1 column 2 (char 1),
Received chunk: {
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/vertex_httpx.py", line 1415, in __anext__
    json_chunk = json.loads(chunk)
                 ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

During hand...
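
For reference, the exact JSONDecodeError above can be reproduced by feeding the logged chunk (a bare {) to json.loads, which suggests the iterator is being handed a partial JSON fragment rather than a complete object:

import json

# The received chunk from the log above is just "{" -- the opening brace of a
# JSON object whose remainder presumably arrived in a later chunk.
try:
    json.loads("{")
except json.JSONDecodeError as e:
    print(e)  # Expecting property name enclosed in double quotes: line 1 column 2 (char 1)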

Twitter / LinkedIn details

https://twitter.com/DaveManouchehri

Manouchehri commented 3 months ago

This isn't rate limiting; it's a critical error that seems to happen during long requests.

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 9633, in __anext__
    async for chunk in self.completion_stream:
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/vertex_httpx.py", line 1429, in __anext__
    raise RuntimeError(f"Error parsing chunk: {e},\nReceived chunk: {chunk}")
RuntimeError: Error parsing chunk: Expecting property name enclosed in double quotes: line 1 column 2 (char 1),
Received chunk: {
krrishdholakia commented 2 months ago

What should happen here? It looks like invalid JSON was received: {

And can you share a way for us to repro this, @Manouchehri?

Manouchehri commented 2 months ago

Yeah, that's a good point. Sorry for not replying; I'm not quite sure why Vertex AI is doing this (or whether it's a bug somewhere else).

It's really hard to trigger this bug: I have to spend between $100 and $1,000 in API requests before I trigger it every so often. It happens more with larger, slower prompts.
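
If the cause is a single JSON object getting split across network chunks on these slow responses, one possible direction would be to buffer data until a complete object parses instead of raising on the first fragment. This is just a sketch, assuming the iterator currently calls json.loads on each raw chunk; ChunkBuffer is a hypothetical helper, not litellm's actual code.

import json

class ChunkBuffer:
    # Hypothetical helper, for illustration only: accumulate partial stream
    # data and only hand back a parsed object once the buffer is valid JSON.
    def __init__(self):
        self._buf = ""

    def feed(self, chunk: str):
        self._buf += chunk
        try:
            obj = json.loads(self._buf)
        except json.JSONDecodeError:
            # Incomplete (or malformed) JSON so far -- wait for more data
            # instead of raising and killing the whole stream.
            return None
        self._buf = ""
        return obj

# The object arrives split across two chunks, as in the logs above.
buf = ChunkBuffer()
assert buf.feed("{") is None
assert buf.feed('"candidates": []}') == {"candidates": []}

A real fix would also need some cutoff so a genuinely malformed payload still surfaces an error instead of buffering forever.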