Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
https://anythingllm.com
MIT License

[FEAT]: Improve embed chat widget `query` mode backfilling for QA #2436

Open Alminc91 opened 2 days ago

Alminc91 commented 2 days ago

How are you running AnythingLLM?

Docker (local)

What happened?

Hi,

Problem: In the main application, query mode uses the chat history, but according to my tests the embedded widget only uses the chat history in chat mode.

Goal: It would be amazing to be able to reference the previous question (at least the most recent one) in query mode with the widget, too! In chat mode I eventually get hallucinations from open-source LLM models that are not grounded in my embedded documents, even when I try to tune the system prompt. For our use case we would therefore need query mode plus chat history (ideally it would even respect the main app's "chat history" setting for the number of previous chats).

More information: Under api/docs/#/Workspaces/post_v1_workspaceslugstream_chat I also saw this comment: "Send a prompt to the workspace and the type of conversation (query or chat). Query: will not use the LLM unless there are relevant sources from the vector DB, and does not recall chat history. Chat: uses the LLM's general knowledge w/ custom embeddings to produce output, and uses rolling chat history."
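For reference, the two modes can be exercised against that endpoint directly. The sketch below builds the request described in the API doc quoted above; the `/api/v1/workspace/{slug}/stream-chat` path and the `message`/`mode` body fields follow that doc, while `baseUrl`, `apiKey`, and the slug are placeholders:

```javascript
// Hedged sketch of a request to the workspace stream-chat endpoint.
// Path and body shape follow the API doc referenced above; all
// concrete values (baseUrl, apiKey, slug) are placeholders.
function buildStreamChatRequest(baseUrl, apiKey, slug, message, mode) {
  return {
    url: `${baseUrl}/api/v1/workspace/${slug}/stream-chat`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // mode is "query" (answers only from vector-DB sources, no chat
      // history) or "chat" (rolling history + general LLM knowledge).
      body: JSON.stringify({ message, mode }),
    },
  };
}

// Usage with a fetch-capable runtime, e.g.:
// const { url, options } = buildStreamChatRequest(
//   "http://localhost:3001", "MY-API-KEY", "my-workspace",
//   "What does the manual say about setup?", "query"
// );
// const response = await fetch(url, options);
```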

But I am not sure whether this is related to the issue, since it works in the main app. Thanks and blessings!

Are there known steps to reproduce?

No response

timothycarambat commented 2 days ago

When in query mode, the chat embed does not maintain history. https://github.com/Mintplex-Labs/anything-llm/blob/b658f5012de0658272fb0853a1c00a1fda2ccf1f/server/utils/chats/embed.js#L201

Whereas in https://github.com/Mintplex-Labs/anything-llm/blob/b658f5012de0658272fb0853a1c00a1fda2ccf1f/server/utils/chats/stream.js#L64 you can see that the chat will refuse only if the entire thread is missing citations (enabling follow-up questions on found citations).

https://github.com/Mintplex-Labs/anything-llm/blob/b658f5012de0658272fb0853a1c00a1fda2ccf1f/server/utils/chats/stream.js#L181

It makes sense to enable this follow-up citation backfilling for the embed widget as well.