Altinn / digdir-assistants

Generative AI assistants

Use query understanding for RAG retrieval #8

Open bdb-dd opened 10 months ago

bdb-dd commented 10 months ago

Description

Updated 26.02.24: Even after successfully addressing the retrieval ranking issues we had earlier, there are still many opportunities to improve retrieval for specific kinds of queries. For example, a user may wish to qualify their search with filters such as "updated recently", "sorted by version number", or "only open issues".

The "Query understand" strategy calls for using LLMs to generate the retrieval query itself, based on a combination of knowledge of the underlying search engine, the data schemas involved and potentially some function calling extensions.

## Evaluate
- [x] Evaluate traditional text indexing tools designed to deliver relevant results from free text queries
- [x] Test a specific free text query engine with a generic configuration suitable for our documentation data set
- [ ] [in-progress] Gather requirements for evaluating and tuning free text query performance
- [ ] [in-progress] Evaluate results and determine if there is a need to tune the configuration, for example for special handling of multiple languages, content that has been machine translated, recency, metadata
- [ ] https://github.com/Altinn/digdir-slack-bot/issues/49

Additional Information

No response

bdb-dd commented 10 months ago

First end-to-end test completed. Initial results look very promising.

Will likely require additional testing and content improvements to deal with issues related to certain topics.

Certain documents should probably be included in context regardless of search terms.

bdb-dd commented 10 months ago

Sent an invitation to a broader group of people who can contribute a varied set of user queries. We are quickly finding examples where the first stage, extracting search terms, is not as selective as it could be: a large number of search terms currently results in a smaller result set, sometimes including documents that are highly ranked for no apparent reason.
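One possible mitigation, sketched below under the same assumptions as the earlier example (OpenAI Node SDK; prompt wording hypothetical), is to ask the extraction stage for a small, capped number of highly selective phrases rather than an exhaustive term list:

```typescript
import OpenAI from "openai";

const openai = new OpenAI();

// Hypothetical first-stage prompt: request a few discriminating phrases
// instead of every plausible term, to keep the result set focused.
async function extractSearchTerms(userQuery: string): Promise<string[]> {
  const resp = await openai.chat.completions.create({
    model: "gpt-3.5-turbo-1106",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          "Extract at most 3 search phrases that best discriminate relevant " +
          'documentation pages. Return JSON: {"terms": ["..."]}',
      },
      { role: "user", content: userQuery },
    ],
  });
  const parsed = JSON.parse(resp.choices[0].message.content ?? '{"terms": []}');
  return ((parsed.terms ?? []) as string[]).slice(0, 3); // hard cap as a safety net
}
```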

Have also tested asking GPT-3.5 for feedback on which of the supplied context documents were relevant, with good results. So one option would be to "pin" certain source documents, such that they are always included in the RAG context. The context length has varied significantly from query to query, sometimes exceeding 16K tokens, which is our current upper limit.
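A minimal sketch of how pinning and the context budget could fit together; the pinned document IDs are placeholders, tokens are approximated as chars/4 rather than using a real tokenizer, and the 16K figure is applied as a hard cap:

```typescript
interface Doc {
  id: string;
  text: string;
}

// Hypothetical "always include" documents, independent of search terms.
// (Fetching the Doc objects for these IDs is left out of this sketch.)
const PINNED_DOC_IDS = ["getting-started", "glossary"];
const MAX_CONTEXT_TOKENS = 16_000; // current upper limit

// Crude token estimate; a real implementation would use a tokenizer.
const approxTokens = (s: string) => Math.ceil(s.length / 4);

// Pinned docs are placed first, then ranked results, until the budget runs out.
function assembleContext(pinned: Doc[], ranked: Doc[]): Doc[] {
  const context: Doc[] = [];
  let used = 0;
  for (const doc of [...pinned, ...ranked]) {
    const cost = approxTokens(doc.text);
    if (used + cost > MAX_CONTEXT_TOKENS) break; // enforce the upper limit
    context.push(doc);
    used += cost;
  }
  return context;
}
```

The `ranked` list here could be the output of the GPT-3.5 relevance check described above, so that only documents the model judged relevant compete for the remaining budget.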