As a user, I want to send a request to a FastAPI endpoint and have the LLM's response streamed directly to the frontend. The flow is as follows:
The user makes a POST request to the endpoint.
The backend does some processing and calls the LLM.
The LLM's response is streamed directly to the user's frontend via StreamingResponse.
I'm currently taking the response from the LLM, chunking it, and then streaming the chunks, but this is slow and I need to speed it up.
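The difference between the two approaches can be sketched with the standard library alone. This assumes the current approach waits for the complete LLM response before chunking, which may not match the real code exactly. All function names here are hypothetical, and `fake_llm_stream` simulates an LLM client that emits tokens with a small delay.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_llm_stream() -> AsyncIterator[str]:
    # Hypothetical stand-in for an LLM client that emits tokens as they are generated.
    for token in ["Hello", ", ", "world"]:
        await asyncio.sleep(0.01)  # simulated per-token generation delay
        yield token


async def buffered_then_chunked(chunk_size: int = 4) -> AsyncIterator[str]:
    # Assumed current approach: wait for the full response, then slice it into chunks.
    full_text = "".join([token async for token in fake_llm_stream()])
    for start in range(0, len(full_text), chunk_size):
        yield full_text[start:start + chunk_size]


async def pass_through() -> AsyncIterator[str]:
    # Desired approach: forward each token as soon as it arrives.
    async for token in fake_llm_stream():
        yield token


async def collect(stream: AsyncIterator[str]) -> List[str]:
    # Helper that drains a stream into a list, for inspection.
    return [piece async for piece in stream]
```

Both generators deliver the same text, but `buffered_then_chunked` cannot yield anything until the last token has been generated, so the time to the first chunk equals the total generation time. `pass_through` yields after the first token.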
I'm very new to Lanarky and haven't been able to make this work with the currently available API docs. If this feature is available, I would appreciate any guidance on how to achieve it.