Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.
https://useanything.com
MIT License
17.66k stars · 1.9k forks

RAG implementation with user-roles getting different results? #1551

Status: Open

thebaldgeek commented 1 month ago

How are you running AnythingLLM?

Docker (local)

What happened?

Trained the system on about 100 PDF and text documents, using the default built-in database. Windows 11. Docker image pulled around 18 hours ago.

The admin user gets answers from the trained docs as expected. Non-admin users get raw Ollama llama3.0 LLM answers, yet with "correct" citations. Tested with the exact same questions as well as other basic questions. 100% of the time the admin user gets the correct answer text, showing the trained docs were accessed, and the 4 citations contain parts of the answer text as expected. The default user gets a raw LLM answer (often comically wrong) with "correct" 4 citations that contain the question text (for the questions that are the same).

Tested in Firefox, Chrome, and Edge, in both regular and incognito sessions. The user is set to the "Default" role and has been added to the workspace.

The workaround is to have all users log in as admins, with a request not to change any settings.

Are there known steps to reproduce?

No response

timothycarambat commented 1 month ago

This is not how the RAG implementation works, or should work. Everyone has the same access to the documents in a given workspace, so when a prompt is sent in a workspace, the only user-specific input is the current chat/thread history of that specific user.
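To illustrate the behavior described above, here is a minimal sketch (not AnythingLLM's actual code; all names are hypothetical, and the word-overlap scoring is a stand-in for real vector search) showing why retrieval scoped to the workspace cannot differ by user role, since the only per-user input is the chat history:

```python
def retrieve_context(workspace_docs, query, top_k=4):
    """Rank workspace documents against the query by naive word overlap.

    A placeholder for the real vector search; the key point is that the
    inputs are workspace-level only, with nothing user-specific."""
    q = set(query.lower().split())
    scored = sorted(
        workspace_docs,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(workspace_docs, query, user_history):
    # Only user_history differs per user; the retrieved context does not.
    context = retrieve_context(workspace_docs, query)
    return {"context": context, "history": user_history, "query": query}

docs = ["billing policy overview", "refund policy details", "shipping times"]
admin = build_prompt(docs, "refund policy", user_history=["earlier chat"])
default_user = build_prompt(docs, "refund policy", user_history=[])

# Identical retrieved context regardless of role.
assert admin["context"] == default_user["context"]
```

Under this model, any role-dependent difference in answers would have to come from the chat history (or a client-side issue), not from document access.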

I think there is either a misunderstanding or some detail missing from this issue that is leading to the poor RAG results reported here.

This part:

> Default user gets raw LLM answer (often comically wrong) with 'correct' 4 citations that contain question text listed (for those questions that are the same).

is quite odd, as the citations should not exist at all unless the cited chunks were explicitly used in response generation 🤔