Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License
5.57k stars 3.74k forks

How to define/exclude categories? #1715

Open amirj opened 1 week ago

amirj commented 1 week ago

There is an option in "Developer Settings" named "Exclude category"; could you help me understand how it works?

Is it possible to organise the underlying dataset into categories and then exclude some of them from the search results?

zedhaque commented 1 week ago

When you run "prepdocs.sh" or "prepdocs.ps1" to build your index, you can pass a "category" parameter, so that the documents indexed in that run are tagged with that category value. You can then use the "Exclude category" developer setting in the frontend during your chat and ask sessions. You will need to type the category values into the frontend manually, using the same values you passed during indexing.

You need to modify "prepdocs.sh" or "prepdocs.ps1" - line 67 to be exact.

./.venv/bin/python ./app/backend/prepdocs.py './data/*' --verbose \

to something like

./.venv/bin/python ./app/backend/prepdocs.py './data/*' --verbose --category "$CONTENT_CATEGORY" \

Of course, you need to pass your $CONTENT_CATEGORY dynamically, since you are indexing various files. One option would be to structure the data folder so that each subfolder name is the content category.
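
As a rough sketch of that folder-per-category idea (not something that ships with the repo), a loop like the one below could call prepdocs.py once per subfolder. It assumes a layout of "./data/<category>/<files>"; the function name "index_by_category" and the "PREPDOCS_CMD" variable are made up for illustration, and the real script passes further arguments after "--verbose" that you would need to keep.

```shell
#!/usr/bin/env bash
# Sketch only: index each subfolder of a data directory, using the
# subfolder name as the category. Assumed layout: ./data/<category>/<files>

# Hypothetical override point: set PREPDOCS_CMD=echo for a dry run that
# prints the arguments instead of indexing anything.
PREPDOCS_CMD="${PREPDOCS_CMD:-./.venv/bin/python ./app/backend/prepdocs.py}"

index_by_category() {
  local data_root="$1"
  local dir category
  for dir in "$data_root"/*/; do
    # If there are no subfolders the pattern stays literal; skip it.
    [ -d "$dir" ] || continue
    category="$(basename "$dir")"
    # PREPDOCS_CMD is left unquoted on purpose so it splits into words.
    # The glob is quoted so prepdocs.py receives it unexpanded, as in the
    # original './data/*' call. Add the script's remaining arguments here.
    $PREPDOCS_CMD "${dir}*" --verbose --category "$category"
  done
}
```

You would then call "index_by_category ./data" in place of the single prepdocs.py line. Files sitting directly in "./data" are skipped by this sketch, so they would need their own call or a default category.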

Hope this helps.