microsoft / kernel-memory

RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.
https://microsoft.github.io/kernel-memory
MIT License
1.34k stars, 252 forks

Add streaming response (text only) as discussed in #625 #652

Closed: chaelli closed this 4 weeks ago

chaelli commented 4 weeks ago

Motivation and Context (Why the change? What's the scenario?)

High level description (Approach, Design)

dluc commented 4 weeks ago

Hi @chaelli, there's already a PR in progress for streaming responses here: https://github.com/microsoft/kernel-memory/pull/400

chaelli commented 4 weeks ago

@dluc I know, that's why I marked it as a draft; I didn't know it would also be listed here. I just wanted to quickly show the "simple" way. I will close this. I also added a comment on discussion #625. I just hope we can get some streaming solution, because for direct use in UIs this is very important.