Added support for holding conversations in chat (LLM "remembers" what has been said).
How
The assistant system has been refactored to support two new concepts, pre-made "pipelines" and stream "inserts", as well as a context object that is passed around so the streams can share data.
Pipelines are pre-made classes that produce a Stream and can optionally modify internal state for the entire stream chain.
A "RAG scoring" pipeline is included that takes the user query and populates the stream state with the scored vectordb results (same behavior as before). A future pipeline could, for example, implement a different strategy for fetching RAG results.
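A minimal sketch of the pipeline and context ideas described above. The actual project appears to be C#-based (`IAsyncRepo`), so this TypeScript is only illustrative; all names here (`StreamContext`, `Pipeline`, `RagScoringPipeline`) are assumptions, not the real API.

```typescript
// Shared state passed along the stream chain (illustrative shape).
interface StreamContext {
  query: string;
  ragResults: { text: string; score: number }[];
}

// A pipeline produces a stream of messages and may mutate the shared context.
interface Pipeline {
  run(ctx: StreamContext): AsyncGenerator<string>;
}

// Sketch of the "RAG scoring" idea: populate the context with scored
// vector-db results, making them available to every later stream in the chain.
class RagScoringPipeline implements Pipeline {
  constructor(
    private search: (q: string) => Promise<{ text: string; score: number }[]>
  ) {}

  async *run(ctx: StreamContext): AsyncGenerator<string> {
    // Modifies internal state for the entire stream chain.
    ctx.ragResults = await this.search(ctx.query);
    for (const r of ctx.ragResults) {
      yield r.text;
    }
  }
}
```

A different RAG-fetching strategy would then just be another `Pipeline` implementation, swapped in without touching the rest of the chain.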
A Stream insert produces and inserts messages within a stream. Included is a "history" insert which inserts the previous user questions and LLM answers verbatim, in order. A future insert could, for example, summarize the history instead.
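The insert concept can be sketched the same way. Again, this is illustrative TypeScript under assumed names (`StreamInsert`, `HistoryInsert`), not the project's actual interfaces.

```typescript
// A message in the conversation (illustrative shape).
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// An insert produces messages to splice into the stream.
interface StreamInsert {
  produce(history: ChatMessage[]): ChatMessage[];
}

// "History" insert: replays previous user questions and LLM answers
// verbatim, in order.
class HistoryInsert implements StreamInsert {
  produce(history: ChatMessage[]): ChatMessage[] {
    return [...history];
  }
}
```

A summarizing insert would implement the same `StreamInsert` shape but condense `history` into fewer messages before returning it.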
Conversation history
Handled similarly to other IAsyncRepo sources. A chat history entry is generated when the chat endpoint is called and sent to the client via SSE. The client then includes this conversation id with future questions to continue that specific conversation.
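The client side of that round trip can be sketched like this. The request shape and field name `conversationId` are assumptions for illustration; the real payload may differ.

```typescript
// Request body for the chat endpoint (illustrative shape).
interface ChatRequest {
  question: string;
  conversationId?: string; // omitted on the first question
}

// The server emits a conversation id over SSE; the client keeps it and
// sends it back with follow-up questions to continue that conversation.
function nextRequest(question: string, lastConversationId?: string): ChatRequest {
  return lastConversationId
    ? { question, conversationId: lastConversationId }
    : { question };
}
```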
NOTE: BREAKING CHANGES
The assistant template structure has changed. Previously added project.assistant entries are invalid. Remove the project (and let setup re-run) to fix, or update the templates manually in the db.
How to test
Right now there is feature parity with before, plus conversations. So:
Migrate templates
Make sure chat still works
Check that conversation history works by asking a related follow-up question, for example:
User: Where is the central station?
LLM: The central station in Helsingborg, called Knutpunkten, is located at...
User: How do I get there from the south?
LLM: If you are south of Helsingborg and want to get to Knutpunkten...