Closed chriswilty closed 2 months ago
Found the cause: the function that produces a Retriever from a VectorStore takes a parameter k, the number of documents to include in the search context. See VectorStore.asRetriever.
As we have very few documents in SpyLogic, it is ok to include them all (for the current level) in the context. Note that if we had a more real-world number of documents in our VectorStore, we would probably want to use a different strategy such as Contextual Compression or MRL.
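To illustrate why a small k hides entries, here is a minimal sketch of top-k retrieval. ToyVectorStore, its word-overlap scoring and the nine placeholder records are all made up for illustration; this is not the LangChain API, only the same idea of keeping the k best matches.

```typescript
// Toy stand-in for a vector store. NOT the LangChain API, just the same top-k idea.
type Doc = { id: number; text: string };

class ToyVectorStore {
  private readonly docs: Doc[];

  constructor(docs: Doc[]) {
    this.docs = docs;
  }

  // Crude stand-in for embedding similarity: count query words found in the text.
  private score(query: string, doc: Doc): number {
    const words = query.toLowerCase().split(/\s+/).filter(Boolean);
    const text = doc.text.toLowerCase();
    return words.filter((w) => text.includes(w)).length;
  }

  // Returns a retriever that keeps only the k best-scoring documents.
  asRetriever(k: number): (query: string) => Doc[] {
    return (query) =>
      [...this.docs]
        .sort((a, b) => this.score(query, b) - this.score(query, a))
        .slice(0, k);
  }
}

// Nine placeholder records (hypothetical, not the real CSV contents).
const employees: Doc[] = Array.from({ length: 9 }, (_, i) => ({
  id: i + 1,
  text: `Employee ${i + 1} salary record`,
}));

const store = new ToyVectorStore(employees);
const narrow = store.asRetriever(4); // a small k drops most of the records
const wide = store.asRetriever(employees.length); // k = all documents

console.log(narrow("employee salary").length); // 4
console.log(wide("employee salary").length); // 9
```

With k smaller than the number of documents, the model never sees the remaining records, so no amount of prompting can surface them. Setting k to the document count only makes sense while the store stays this small.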
Test evidence:
Our Q&A bot searches the documents of the current level for answers when the user asks a question. However, it is proving almost impossible to retrieve all the information a user wants in a single chat.
For example:
We need to investigate what's going on here, because if we cannot retrieve all the information we are looking for without any defences enabled, then Sandbox level will prove to be a frustrating experience.
Places to start:
Stick a load more logging in place to begin with, and see what each bot chat or tool call returns.
Acceptance Criteria
GIVEN I am on Sandbox level
AND no defences are enabled
WHEN I ask the bot for salaries of all known employees
THEN all nine entries from file backend/resources/documents/common/management_info.csv are given in the chat response

As bot responses are non-deterministic, you might need to check this a few times to see if you can get it to reveal all nine salaries. An alternative might be to ask it for the names of all employees, to see if it returns them all, and then to ask for the salaries of each of those.
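Because several attempts may be needed, a small helper can make the check less tedious. The sketch below is one possible approach, not part of the project: missingEntries and bestAttempt are hypothetical names, and the employee names are placeholders rather than the real contents of management_info.csv.

```typescript
// Return the expected entries that do not appear in a bot response (case-insensitive).
function missingEntries(response: string, expected: string[]): string[] {
  const haystack = response.toLowerCase();
  return expected.filter((entry) => !haystack.includes(entry.toLowerCase()));
}

// Given the responses from several attempts, report the attempt that missed the fewest entries.
// Returns index -1 if no responses were supplied.
function bestAttempt(
  responses: string[],
  expected: string[]
): { index: number; missing: string[] } {
  let best = { index: -1, missing: expected };
  responses.forEach((response, index) => {
    const missing = missingEntries(response, expected);
    if (best.index === -1 || missing.length < best.missing.length) {
      best = { index, missing };
    }
  });
  return best;
}

// Placeholder names, NOT the real contents of management_info.csv.
const expectedNames = ["Alice", "Bob", "Carol"];

// Responses collected by hand from repeated chats (made-up examples).
const attempts = [
  "Alice earns 10 and Bob earns 20.",
  "Alice, Bob and Carol are all listed.",
];

const result = bestAttempt(attempts, expectedNames);
console.log(result.index, result.missing);
```

The acceptance criterion is met if any attempt comes back with an empty missing list. Matching on names alone is a simplification; a stricter check would also compare each salary value.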