mmabrouk opened 11 months ago
Here is a simple example of implementing streaming with FastAPI and React: https://medium.com/@hxu296/serving-openai-stream-with-fastapi-and-consuming-with-react-js-part-1-8d482eb89702
I think the challenge would be the design of the features. Some early thoughts here:
We can add a parameter, @entrypoint(streaming=True), and expect the user to yield messages instead of returning them; we would then wrap the response in a FastAPI StreamingResponse.
Some questions I would have:
How would the @span decorator handle streaming functions? Or LangChain chains that stream?

@aybruhm says:
The frontend would know from the media_type of the endpoint, which would be either text/event-stream or application/octet-stream.
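For the text/event-stream case, each chunk would need Server-Sent Events framing; a tiny sketch (the helper name is an assumption):

```python
def sse_format(chunk: str) -> str:
    # Server-Sent Events (media_type text/event-stream) frames each
    # message as a "data: ..." line terminated by a blank line.
    return f"data: {chunk}\n\n"

# A streaming endpoint would yield sse_format(token) for each token,
# letting the React side consume it with EventSource or fetch.
```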
Litellm has solved that for us. See here.
For users that will be using our span decorators, we can simply investigate how LiteLLM gets the final streaming chunk and do the same.
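Following that idea, a minimal sketch of a span decorator that passes chunks through to the caller and records the final output once the stream is exhausted (all names here are assumptions, not the actual SDK API):

```python
import inspect

recorded = []  # stand-in for a real trace sink (assumption)

def record_output(output):
    recorded.append(output)

def span(func):
    # Hypothetical sketch: a tracing decorator that also handles generators.
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if not inspect.isgenerator(result):
            record_output(result)          # non-streaming: record directly
            return result

        def traced():
            chunks = []
            for chunk in result:           # pass each chunk through unchanged
                chunks.append(chunk)
                yield chunk
            # Stream exhausted: rebuild the final output from the chunks,
            # similar to how LiteLLM assembles its final streaming response.
            record_output("".join(chunks))

        return traced()
    return wrapper
```

The key point is that the trace is written only after the consumer drains the generator, since that is the first moment the full output exists.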
Is your feature request related to a problem? Please describe.
Using LLMs without streaming is slow.

Describe the solution you'd like
Add a feature to stream outputs to the SDK.
Tasks
AGE-277