danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://docs.danswer.dev/

[CONFLUENCE] Archive pages #789

Open mboret opened 7 months ago

mboret commented 7 months ago

Hi,

Danswer treats archived Confluence pages as "standard" pages, so it can use them when answering a question. IMO that's not a good thing.

I don't know whether the best approach is to filter out archived pages on the indexer side or to downrank those documents during source selection...

sjakos commented 7 months ago

I would filter on the indexer, personally. Maybe add an optional setting for which page statuses to include? The call on the confluence_client can take a list of statuses, so we could support different types of filters if you wanted to include the other pages as separate knowledge sets.

sjakos commented 7 months ago

I tried it with a list of statuses as a quick test and it didn't like the list. If the status argument is set to the string "current", the archived pages won't be included.
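
A minimal sketch of what indexer-side filtering could look like, given that finding. This is not Danswer's actual connector code: `fetch_pages_by_status` and its `statuses` setting are hypothetical names, and the sketch assumes the client exposes a `get_all_pages_from_space(space, start=..., limit=..., status=...)` method that accepts a single status string per call. Since a list was rejected, it makes one paginated pass per status.

```python
from typing import Any, Dict, Iterator, Sequence, Tuple


def fetch_pages_by_status(
    client: Any,
    space: str,
    statuses: Sequence[str] = ("current",),
    batch_size: int = 100,
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (status, page) pairs for each requested status.

    Hypothetical helper. Assumes `client.get_all_pages_from_space`
    takes one status string per call, so each status is fetched
    in its own paginated loop.
    """
    for status in statuses:
        start = 0
        while True:
            batch = client.get_all_pages_from_space(
                space, start=start, limit=batch_size, status=status
            )
            if not batch:
                break
            for page in batch:
                # Keep the status next to the page so the caller can route
                # archived pages to a separate knowledge set if desired.
                yield status, page
            if len(batch) < batch_size:
                # A short batch means this status has no more pages.
                break
            start += len(batch)
```

With the default `statuses=("current",)`, archived pages are never requested. Passing `("current", "archived")` would return both, each tagged with its status, which leaves room for the "separate knowledge sets" idea above.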