Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
https://anythingllm.com
MIT License

Cannot get consistent results from "Chat with document" #2254

Closed mtomas7 closed 1 month ago

mtomas7 commented 1 month ago

Setup:

So first, it is not clear whether drop-in docs and RAG-uploaded docs are treated in the same manner, or whether drop-in docs are added to the Workspace vector DB.

Comparing the drop-in doc feature with LM Studio: when you drop a file in, it evaluates whether there is enough memory to load the whole document and, if so, does exactly that. Answer quality is much better. Perhaps it would be possible to implement this memory check in AnythingLLM too?
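For illustration, a minimal TypeScript sketch of the kind of check being described, assuming a rough ~4-characters-per-token estimate and an arbitrary headroom value; this is not LM Studio's or AnythingLLM's actual code:

```ts
// Hypothetical sketch: rough check for whether a document fits in the
// model's context window. The ~4 chars/token ratio is a common rule of
// thumb for English text, not an exact tokenizer count.
const CHARS_PER_TOKEN = 4;

function fitsInContext(
  documentText: string,
  contextWindowTokens: number,
  reservedTokens = 1024, // assumed headroom for system prompt + reply
): boolean {
  const estimatedTokens = Math.ceil(documentText.length / CHARS_PER_TOKEN);
  return estimatedTokens <= contextWindowTokens - reservedTokens;
}

// Example: an 8k-context model with a 20k-character document
console.log(fitsInContext("x".repeat(20_000), 8192)); // true (~5000 tokens)
```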

ValfarDeveloper commented 1 month ago

Having the same issue here, using Milvus. Could it be related to the embedding model?

timothycarambat commented 1 month ago

> Comparing the drop-in doc feature with LM Studio: when you drop a file in, it evaluates whether there is enough memory to load the whole document and, if so, does exactly that. Answer quality is much better. Perhaps it would be possible to implement this memory check in AnythingLLM too?

This is because LM Studio injects the entire document into the context window. We have RAG with a vector DB built in; LM Studio does not. The reason you get better results is that the whole document is in context!

We preserve context windows by default with RAG - but you can always replicate the same behavior with document pinning. Use that if the entire document needs to be injected. Otherwise, you can mess with the controls in this doc to get better RAG results.

https://docs.anythingllm.com/llm-not-using-my-docs
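To make the trade-off above concrete, here is a hedged TypeScript sketch contrasting the two strategies the maintainer describes. Names like `vectorSearch` and `Chunk` are illustrative stand-ins, not AnythingLLM's real API:

```ts
interface Chunk {
  text: string;
  score: number;
}

// Stub standing in for the vector DB's similarity search; real results
// would come from Milvus, LanceDB, etc., not this placeholder.
async function vectorSearch(query: string, topK: number): Promise<Chunk[]> {
  return Array.from({ length: topK }, (_, i) => ({
    text: `chunk ${i} loosely relevant to "${query}"`,
    score: 1 - i * 0.1,
  }));
}

// Document pinning: the full document text is injected into every prompt,
// which consumes context but guarantees nothing is missed.
function pinnedContext(fullDocument: string): string {
  return fullDocument;
}

// RAG: only the top-k most similar chunks are retrieved per query. This
// preserves the context window but can miss passages if the chunking or
// similarity settings are unfavorable.
async function ragContext(query: string, topK = 4): Promise<string> {
  const chunks = await vectorSearch(query, topK);
  return chunks.map((c) => c.text).join("\n---\n");
}
```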

mtomas7 commented 1 month ago

@timothycarambat Thank you for the clarification! I'm trying to look at this from a practical perspective: an average user will not care much about the internal process, e.g. which embedding method was used, but will care a great deal about the quality of the answer. I'm preparing to present this to new users, and I know that some of them already have the habit of dropping a file into the conversation, as is done in ChatGPT.

Would it make sense to do this pre-loading evaluation and suggest that the user use document pinning if the file is not very large? That way the user interacts with the AI model in what already feels like the "standard way" and gets good results. Or perhaps document pinning should always be used when a document is dropped into the chat window? In the case of a large file or multiple files, the user would be guided to create a RAG instance.

Thank you!
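A minimal sketch of the routing heuristic proposed above, reusing the rough token estimate from earlier; the half-window threshold and function name are assumptions for illustration, not an AnythingLLM feature:

```ts
// Hypothetical routing: when a file is dropped into the chat, estimate
// its size and either pin it (small file, whole document fits) or embed
// it into the workspace vector DB (large file or many files).
type DropAction = "pin-document" | "embed-for-rag";

function suggestDropAction(
  documentText: string,
  contextWindowTokens: number,
): DropAction {
  // Rough ~4 chars/token estimate; leave half the window free for the
  // conversation itself (an assumed, tunable threshold).
  const estimatedTokens = Math.ceil(documentText.length / 4);
  return estimatedTokens <= contextWindowTokens / 2
    ? "pin-document"
    : "embed-for-rag";
}

// Example: a short memo pins; a book-length file goes to the vector DB.
console.log(suggestDropAction("short memo ".repeat(100), 8192)); // "pin-document"
```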

mtomas7 commented 1 month ago

And, not to forget, it looks like these points could potentially be improved in the app: