🔍 Description
LlamaIndex chat engines support streaming responses. It would be a small UX improvement if rags could support streaming the engine's responses to the Streamlit frontend such that users don't have to wait until the entire response is generated.
The only issue is that `.stream_chat` uses async functions, but Streamlit runs in a separate thread that doesn't have an event loop by default. To make it work, the implementation will need to create an event loop and run the `.stream_chat` call inside it.

Happy to submit a PR for this!
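The event-loop workaround described above could be sketched roughly as follows. This is a minimal, self-contained illustration of the threading/asyncio pattern only: `fake_stream_chat` is a hypothetical stand-in for the chat engine's async streaming call (in rags this would be something like LlamaIndex's `astream_chat`), and the Streamlit-side rendering is indicated only in comments.

```python
import asyncio
import threading


def ensure_event_loop() -> asyncio.AbstractEventLoop:
    """Return this thread's event loop, creating one if the thread
    (e.g. a Streamlit script thread) doesn't have one yet."""
    try:
        return asyncio.get_event_loop()
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        return loop


async def fake_stream_chat(message: str):
    # Hypothetical stand-in for the engine's async token stream;
    # the real call would come from the LlamaIndex chat engine.
    for token in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0)  # yield control, as real I/O would
        yield token


def stream_in_thread(message: str) -> str:
    """Consume the async token stream from a non-main thread, the way
    a Streamlit script run would, accumulating tokens as they arrive."""
    loop = ensure_event_loop()

    async def consume() -> str:
        chunks = []
        async for token in fake_stream_chat(message):
            chunks.append(token)  # in Streamlit: update a placeholder here
        return "".join(chunks)

    return loop.run_until_complete(consume())


# Run from a worker thread to mimic Streamlit's threading model.
result = {}
t = threading.Thread(target=lambda: result.update(text=stream_in_thread("hi")))
t.start()
t.join()
print(result["text"])
```

The key point is `ensure_event_loop`: calling `asyncio.get_event_loop()` in a fresh non-main thread raises `RuntimeError`, so the sketch creates and installs a new loop before driving the async stream with `run_until_complete`.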