Closed Manouchehri closed 2 months ago
This isn't rate limiting, it's a critical error that seems to happen during long requests.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 9633, in __anext__
    async for chunk in self.completion_stream:
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/vertex_httpx.py", line 1429, in __anext__
    raise RuntimeError(f"Error parsing chunk: {e},\nReceived chunk: {chunk}")
RuntimeError: Error parsing chunk: Expecting property name enclosed in double quotes: line 1 column 2 (char 1),
Received chunk: {
What should happen here? It looks like invalid JSON was received: the chunk is just {
Also, can you share a way for us to repro this, @Manouchehri?
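A minimal sketch of one possible answer, assuming the lone { is the first fragment of a JSON object that was split across stream chunks (the thread does not confirm this). `JSONChunkBuffer` is a hypothetical helper, not part of litellm; it keeps fragments until they parse as a whole. The first few lines also show that parsing { on its own produces the same decoder message as in the traceback.

```python
import json


class JSONChunkBuffer:
    """Hypothetical helper: accumulates text fragments until they form valid JSON."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, fragment: str):
        """Return the parsed value once the buffered text is complete, else None."""
        self._pending += fragment
        try:
            parsed = json.loads(self._pending)
        except json.JSONDecodeError:
            # Incomplete so far (or genuinely invalid): keep buffering.
            return None
        self._pending = ""
        return parsed


# Parsing the lone "{" fragment by itself fails the same way as in the traceback.
try:
    json.loads("{")
except json.JSONDecodeError as err:
    lone_brace_error = str(err)

# Feeding the same fragment through the buffer defers parsing until the rest arrives.
buf = JSONChunkBuffer()
first = buf.feed("{")
second = buf.feed('"text": "hello"}')
```

A real implementation would also need a size or time limit on the buffer, since this sketch cannot tell a fragment that is still incomplete from one that will never become valid.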
Yeah, that's a good point. Sorry for not replying. I'm not quite sure why Vertex AI is doing this (or if it's a bug somewhere else).
It's really hard to trigger this bug. I have to spend between $100 and $1,000 in API requests before it shows up, and even then only every so often. It happens more often with larger, slower prompts.
What happened?
Seems like with #4459, we aren't handling the cases where the response isn't SSE JSON. I'm not exactly sure what's happening at the moment. My guess is that these are rate-limiting error messages?
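As a rough illustration of handling the "not SSE JSON" case, the sketch below sorts each line of a streamed body into one of four kinds instead of passing everything to the JSON parser. `classify_stream_line` and its labels are hypothetical names for this example only; it assumes the usual SSE convention of a `data:` prefix on payload lines and does not reflect how litellm or Vertex AI actually frame responses.

```python
import json


def classify_stream_line(line: str):
    """Hypothetical helper: return (kind, value) for one line of a streamed body.

    kind is "blank", "json", "partial", or "other".
    """
    stripped = line.strip()
    if not stripped:
        return ("blank", None)
    if stripped.startswith("data:"):
        payload = stripped[len("data:"):].strip()
        try:
            return ("json", json.loads(payload))
        except json.JSONDecodeError:
            # Looks like an SSE data line, but the JSON is incomplete or invalid.
            return ("partial", payload)
    # Not an SSE data line at all, e.g. a plain-text error body.
    return ("other", stripped)


ok = classify_stream_line('data: {"text": "hi"}')
truncated = classify_stream_line("data: {")
plain = classify_stream_line("Too Many Requests")
```

Returning a kind rather than raising lets the caller decide what to do with each case: buffer a "partial" payload, or surface an "other" line as an upstream error with its original text.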
Relevant log output
Twitter / LinkedIn details
https://twitter.com/DaveManouchehri