
[Bug]: Unable to continue async streaming after cancellation #5359

Closed. jamesleeht closed this issue 3 weeks ago.

jamesleeht commented 3 weeks ago

What happened?

Here is an example usage (stream_resp is the async streaming response from litellm.acompletion(..., stream=True)):

chunks = []
try:
    async for chunk in stream_resp:
        text = chunk.choices[0].delta.content or ""
        yield text
        chunks.append(chunk)
except asyncio.CancelledError:
    # If the coroutine is cancelled, we still want to collect all of the chunks.
    async for chunk in stream_resp:
        chunks.append(chunk)

response = litellm.stream_chunk_builder(chunks, messages=messages)

This doesn't seem to work. The use case would be to save the rest of the stream when HTTP responses are cancelled in certain frameworks like Django.

Is there any way to make this work with LiteLLM?
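
One application-level pattern that might avoid the problem entirely (a sketch only, not a LiteLLM feature; the model name and function names below are placeholders): consume the provider stream in a separate asyncio task that pushes chunks onto a queue, so that cancelling the HTTP-facing generator does not cancel the underlying LLM call and every chunk still gets collected.

import asyncio
import litellm

async def stream_and_collect(messages):
    # Sketch: drain the LLM stream in a background task so that the
    # HTTP-facing generator can be cancelled without losing chunks.
    stream_resp = await litellm.acompletion(
        model="gpt-4o-mini",  # placeholder model
        messages=messages,
        stream=True,
    )

    chunks = []
    queue = asyncio.Queue()
    DONE = object()  # sentinel marking the end of the stream

    async def drain():
        # Runs independently of the HTTP response lifecycle.
        async for chunk in stream_resp:
            chunks.append(chunk)
            await queue.put(chunk)
        await queue.put(DONE)
        # Every chunk has been collected, even if the client went away.
        return litellm.stream_chunk_builder(chunks, messages=messages)

    drain_task = asyncio.create_task(drain())

    async def generate():
        # This is what the web framework iterates; cancelling it only
        # stops forwarding text, drain_task keeps running.
        while (chunk := await queue.get()) is not DONE:
            yield chunk.choices[0].delta.content or ""

    return generate(), drain_task

The framework would stream generate() to the client and await drain_task (for example in a background task) to get the fully rebuilt response, even when the client disconnects mid-stream.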

Relevant log output

No response

Twitter / LinkedIn details

No response

krrishdholakia commented 3 weeks ago

Hi @jamesleeht

If coroutine cancelled, we still want to collect all chunks.

except asyncio.CancelledError:
    async for chunk in stream_resp:
        chunks.append(chunk)

If the call is cancelled, this should still work as long as the backend LLM API call itself isn't cancelled. But if the API call is cancelled, or the task running the call is cancelled, then I'm not sure how it would work.

jamesleeht commented 3 weeks ago

If the call is cancelled, this should still work as long as the backend LLM API call itself isn't cancelled. But if the API call is cancelled, or the task running the call is cancelled, then I'm not sure how it would work.

Hey @krrishdholakia, from my testing the second iteration over the generator inside the CancelledError handler does run, but no new chunks are produced. I believe this is because an async generator cannot be iterated a second time in Python: once it has been closed or exhausted, further iteration yields nothing.
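
A minimal standalone illustration of that behaviour, independent of LiteLLM: once an async generator has been closed (which is roughly what happens when the consuming coroutine is cancelled), iterating it again produces no items.

import asyncio

async def numbers():
    for i in range(5):
        yield i

async def main():
    gen = numbers()
    # Consume part of the stream, then close it, roughly mimicking a
    # cancelled consumer tearing down the generator.
    async for i in gen:
        if i == 2:
            break
    await gen.aclose()

    remaining = [i async for i in gen]
    print(remaining)  # [] -- a closed generator yields nothing more

asyncio.run(main())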

krrishdholakia commented 3 weeks ago

that makes sense. so what would you expect litellm to do here? @jamesleeht

jamesleeht commented 3 weeks ago

that makes sense. so what would you expect litellm to do here? @jamesleeht

Could there be a way to continue the stream where it left off, or at least a way to save the progress of the stream so far?

Currently the alternative would be to regenerate a response from scratch instead of continuing the paused stream.

Not sure if this is possible though, and I understand if it isn't :)
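
For the "save the progress" part, a small variation on the original snippet might be enough (a sketch; save_partial_response is a hypothetical persistence hook): instead of iterating the stream a second time, build a partial response from the chunks that were already collected before the cancellation.

import litellm

async def generate(stream_resp, messages):
    chunks = []
    try:
        async for chunk in stream_resp:
            chunks.append(chunk)
            yield chunk.choices[0].delta.content or ""
    finally:
        # Runs on normal completion and on cancellation. Re-iterating
        # stream_resp here would yield nothing, so just build a response
        # from whatever was received before the stream stopped.
        if chunks:
            partial = litellm.stream_chunk_builder(chunks, messages=messages)
            save_partial_response(partial)  # hypothetical persistence hook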

thanks for the quick response

krrishdholakia commented 3 weeks ago

if the stream is cancelled, then i don't think there's a way for us to resume it (assuming the connection already ended).

If you find a way to do this, we'd welcome a pr here!

Closing for now, as this seems like a no-op (currently).