Investigate why Q&A bot will not find all hits in documents

chriswilty commented 2 months ago

When the user asks a question, our Q&A bot searches the documents of the current level for answers. However, it is proving almost impossible to retrieve all the information a user wants in a single chat.

For example:

I ask the bot to tell me the salaries of all employees, knowing there is a document (management_info.csv) listing nine managers and their salaries.
The bot returns a seemingly random selection of just three or four employees from the list.
I ask for the others, and the bot insists that's all it knows about.
If I keep insisting there are more, sometimes an extra one or two are found and sometimes even fewer are retrieved; each time, the bot insists there are no more.
Sometimes it returns just one, and insists there are no more.

We need to investigate what's going on here, because if we cannot retrieve all the information we are looking for without any defences enabled, then Sandbox level will prove to be a frustrating experience.

Places to start:

Does the Q&A bot actually return all the info we seek, but the chatbot then ignores that when generating its final response?
Are there any limits on searches that are configurable, and that are currently preventing all hits being returned, or all documents from being included in a search?

Stick a load more logging in place to begin with, and see what each bot chat or tool call returns.
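
One way to add that logging without touching each call site is a small wrapper that records what goes into and comes out of an async call. This is only a sketch: `withLogging`, `fakeQueryDocuments` and the label are hypothetical names for illustration, not functions from the codebase.

```typescript
// Hypothetical helper: wraps any async function so that its arguments
// and its resolved result are both written to the debug log.
function withLogging<A extends unknown[], R>(
  label: string,
  fn: (...args: A) => Promise<R>
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    console.debug(`[${label}] input: ${JSON.stringify(args)}`);
    const result = await fn(...args);
    console.debug(`[${label}] output: ${JSON.stringify(result)}`);
    return result;
  };
}

// Stand-in for a Q&A tool call; the real one would query the documents.
const fakeQueryDocuments = async (question: string): Promise<string> =>
  `answer to: ${question}`;

const loggedQuery = withLogging("queryDocuments", fakeQueryDocuments);
```

Comparing the logged tool output with the final chat response should show whether information is lost during retrieval or when the chatbot generates its reply.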

Acceptance Criteria

GIVEN I am on Sandbox level
AND no defences are enabled
WHEN I ask the bot for salaries of all known employees
THEN all nine entries from file backend/resources/documents/common/management_info.csv are given in the chat response

As bot responses are non-deterministic, you might need to check this a few times to see if you can get it to reveal all nine salaries. An alternative might be to ask it for the names of all employees, to see if it returns them all, and then to ask for the salaries of each of those.

chriswilty commented 2 months ago

Found the cause: the function that produces a Retriever from a VectorStore takes a parameter k, which sets the number of documents to include in the search context. When k is lower than the number of relevant documents, only some of them reach the bot. See VectorStore.asRetriever.

As we have very few documents in SpyLogic, it is OK to include them all (for the current level) in the context. Note that if we had a more realistic number of documents in our VectorStore, we would probably want to use a different strategy such as Contextual Compression or MRL.
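
The effect of k can be shown with a toy top-k retriever. This is a minimal sketch and not the LangChain implementation: `ScoredDoc`, `retrieveTopK` and the scores are made up for illustration, and it assumes each of the nine CSV entries ends up as its own document in the store.

```typescript
// Toy stand-in for a vector store search result. NOT the LangChain API.
interface ScoredDoc {
  content: string;
  score: number; // similarity to the query; higher means more relevant
}

// Return the k highest-scoring documents, as a top-k retriever would.
function retrieveTopK(docs: ScoredDoc[], k: number): ScoredDoc[] {
  return [...docs].sort((a, b) => b.score - a.score).slice(0, k);
}

// Nine hypothetical rows standing in for the entries in management_info.csv.
const rows: ScoredDoc[] = Array.from({ length: 9 }, (_, i) => ({
  content: `manager ${i + 1}`,
  score: 1 / (i + 1),
}));

const limited = retrieveTopK(rows, 4); // a small k truncates the context
const everything = retrieveTopK(rows, rows.length); // k = number of documents

console.log(limited.length, everything.length); // prints "4 9"
```

With a small k, the bot only ever sees a subset of the entries, which would match the "three or four employees" symptom described above. Setting k to the number of documents for the current level passes all of them into the context.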

chriswilty commented 2 months ago

Test evidence:

ScottLogic / prompt-injection

Investigate why Q&A bot will not find all hits in documents #902