Closed Jonny-GM closed 1 year ago
Thanks for the thorough thinking for this PR! 👌🏾
What's the reasoning behind defiltering the search query in the first place? Keeping the full file path private?
The reason for defiltering the search query once the filtered results have been retrieved from your knowledge base is:
This logic is unfortunately mostly based on ad-hoc testing rather than proper benchmarks. And it will create issues when you ask general questions like "what does yesterday's note say" for the exact reasons you inferred.
Looking at the filename for the date filters makes sense.
Passing the user query with filters to the chat model needs to be tested out. The larger models may no longer be thrown off by the query filter format. We can run the Khoj chat quality tests to check that it doesn't degrade their capabilities.
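For readers following along, "defiltering" can be sketched roughly as below. This is only an illustration, not Khoj's actual implementation: the `defilter` helper and the `FILTER_PATTERN` regex are hypothetical, and the filter forms it strips (`file:"..."`, `dt:"..."`, `dt>="..."`) are assumed for the sake of the example.

```python
import re

# Hypothetical pattern: matches assumed filter forms such as
# file:"..." and dt:"...", dt="...", dt>="...", dt<"..." (double-quoted values only).
FILTER_PATTERN = re.compile(r'\b(?:file:|dt(?:[:=]|[<>]=?))\s*"[^"]*"')


def defilter(query: str) -> str:
    """Return the query with the assumed filter terms removed."""
    stripped = FILTER_PATTERN.sub(" ", query)
    # Collapse the whitespace left behind by removed filters.
    return " ".join(stripped.split())


if __name__ == "__main__":
    print(defilter('what did I write dt:"yesterday" file:"journal/2024-03-14.md"'))
```

Comparing chat quality with and without this step would then amount to sending either `query` or `defilter(query)` to the chat model.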
Suppose the date of each note is in its filename; this is common and is, for example, the default behavior of the Obsidian Daily Notes feature. A simple chat message such as "what does yesterday's note say" then fails to give relevant results, for 2 reasons:
This PR resolves these such that the simple chat query works as expected.
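As a rough sketch of the filename-based date lookup discussed here (not the code in this PR): the helpers `date_from_filename` and `notes_for_day` are hypothetical names, and the sketch assumes notes are named in the `YYYY-MM-DD` style, as with Obsidian Daily Notes' default format.

```python
import re
from datetime import date
from pathlib import Path
from typing import Iterable, List, Optional

# Assumption: the date appears in the filename as YYYY-MM-DD.
DATE_IN_NAME = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def date_from_filename(path: str) -> Optional[date]:
    """Return the date embedded in the file's name, or None if there is none."""
    match = DATE_IN_NAME.search(Path(path).stem)
    if match is None:
        return None
    year, month, day = map(int, match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Looks like a date but is not a valid calendar day, e.g. 2024-13-40.
        return None


def notes_for_day(paths: Iterable[str], day: date) -> List[str]:
    """Keep only the notes whose filename date equals the requested day."""
    return [p for p in paths if date_from_filename(p) == day]
```

With this, a query like "what does yesterday's note say" could resolve "yesterday" to a `date` and call `notes_for_day(paths, that_date)` to narrow the candidate notes before search.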