Open K-Schubert opened 2 months ago
I'm not sure it is necessary to display the topic categories in the chat reply. In future there may be a setting for selecting a language; we would then filter the answers used for question completion by that language, especially in the expert database. The same could be done for subject categories: for example, if you don't work in the area of child benefit, you don't need to see suggested questions from that area.
Inspiration:
# Category extraction
import urllib.parse


def _extract_category(url, category_position_in_path=2):
    """Extract the category from the URL of a webpage."""
    # Splitting "/de/familienzulagen/..." yields ['', 'de', 'familienzulagen', ...]
    path = urllib.parse.urlparse(url).path.split('/')
    # Check against the requested position rather than a hard-coded 2
    if len(path) > category_position_in_path:
        return path[category_position_in_path]
    return None


print(_extract_category('https://faq.bsv.admin.ch/de/familienzulagen/wann-gilt-ein-jugendlicher-als-ausbildung'))
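A rough sketch of how the filtering itself might look, assuming each suggested question carries its source URL and that the first two path segments are the language and the category (as in the example URL above). The names `language_and_category`, `filter_suggestions`, and `excluded_categories`, the dictionary layout, and the `example.org` URLs are hypothetical and only serve as illustration:

```python
from urllib.parse import urlparse


def language_and_category(url):
    """Return (language, category) from the first two path segments, None if missing."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    language = segments[0] if len(segments) > 0 else None
    category = segments[1] if len(segments) > 1 else None
    return language, category


def filter_suggestions(suggestions, language=None, excluded_categories=()):
    """Keep suggestions in the chosen language whose category is not excluded."""
    kept = []
    for suggestion in suggestions:
        lang, category = language_and_category(suggestion["url"])
        if language is not None and lang != language:
            continue  # wrong language for this user
        if category in excluded_categories:
            continue  # user does not work in this subject area
        kept.append(suggestion)
    return kept


# Placeholder data: only the first URL comes from the issue, the others are made up.
suggestions = [
    {"url": "https://faq.bsv.admin.ch/de/familienzulagen/wann-gilt-ein-jugendlicher-als-ausbildung"},
    {"url": "https://example.org/de/other-category/some-question"},
    {"url": "https://example.org/fr/familienzulagen/some-question"},
]

visible = filter_suggestions(
    suggestions, language="de", excluded_categories={"familienzulagen"}
)
print([s["url"] for s in visible])
```

Applying the filter before the autocomplete lookup keeps the candidate set small; whether the settings would live per user or per session is left open here.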
Yes, I think that's a good idea. The more data/DB filtering we can do before running RAG/autocomplete, the better the quality of the answer. FYI: research on RAG has reported improved performance with "semantic routing", which is the same concept but uses embeddings of the queries instead of hard category filtering.
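To make the idea concrete, here is a minimal sketch of semantic routing, under heavy assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, and the route names, example utterances, and `threshold` value are invented for illustration. Only the overall shape (embed the query, compare it with example embeddings per category, pick the closest category) reflects the technique:

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical routes: a category name mapped to a few example queries.
ROUTES = {
    "familienzulagen": [
        "who is entitled to child benefit",
        "child benefit during vocational training",
    ],
    "pensions": [
        "how is my pension calculated",
        "when can i draw my pension",
    ],
}


def route(query, routes=ROUTES, threshold=0.1):
    """Return the category whose examples are most similar to the query, or None."""
    query_vec = embed(query)
    best_category, best_score = None, 0.0
    for category, examples in routes.items():
        score = max(cosine(query_vec, embed(example)) for example in examples)
        if score > best_score:
            best_category, best_score = category, score
    # Below the threshold, no route is chosen and the search stays unfiltered.
    return best_category if best_score >= threshold else None


print(route("is my child entitled to child benefit"))
```

The selected category could then restrict which documents are searched, in the same place where the hard category filter would otherwise be applied.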
From FAQ scraping.