zadam / trilium

Build your personal knowledge base with Trilium Notes
GNU Affero General Public License v3.0

(Bug report) Word-wise matching in search #3823

Open behold-research opened 1 year ago

behold-research commented 1 year ago

Trilium Version

0.59.3

What operating system are you using?

Ubuntu

What is your setup?

Local (no sync)

Operating System Version

Ubuntu Desktop 20.04

Description

Searching for "ego" in double quotes returns notes that contain the word 'category', even though quoting the term is supposed to guarantee exact matching.

Error logs

No response

zadam commented 1 year ago

Hmm, the matching is not "word aware". Quotes are actually meant for the opposite case, like for searching whole sentences, e.g. "hello world".

behold-research commented 1 year ago

Thanks for clarifying this. I guess what got me confused was the wording on the search syntax helper page: https://github.com/zadam/trilium/wiki/Search

rings tolkien - fulltext search, this will try to find notes which have anywhere words "rings" and "tolkien"

Coming from an information retrieval background, this implies to me that indexing tokenises the text into words and that searches are performed against that index. I guess there are some partial workarounds, e.g. note.title *=* ego and not (note.title *=* category), but what if a desired result contains both words? That case can still be handled with boolean logic, but it's no longer a search that can be issued quickly. Just my 2 cents. Many thanks for all the work on Trilium!

zadam commented 1 year ago

Yeah, I mean it's not an unreasonable expectation. Word tokenization happens on the search string, but not on the searched documents.

I will reopen this, since I think it's a reasonable feature request / bug (although I don't plan to address it in the near future).
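
To illustrate the distinction drawn in this thread (the search string is tokenized, the searched documents are not), here is a minimal Python sketch. It is not Trilium's actual implementation: the function names `tokenize`, `substring_match`, and `word_match`, as well as the sample note titles, are hypothetical and chosen only for illustration. It contrasts substring matching, which reproduces the "ego" matching "category" behaviour, with matching against a tokenized document.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"\w+", text.lower())


def substring_match(query: str, document: str) -> bool:
    """Tokenize only the query; each token may occur anywhere in the raw document text."""
    doc = document.lower()
    return all(token in doc for token in tokenize(query))


def word_match(query: str, document: str) -> bool:
    """Tokenize both sides; each query token must equal a whole word of the document."""
    doc_words = set(tokenize(document))
    return all(token in doc_words for token in tokenize(query))


# Hypothetical note titles, for illustration only.
notes = ["Category overview", "The ego and the id"]

# Substring matching: "ego" is found inside "category", so both notes match.
print([n for n in notes if substring_match("ego", n)])

# Word-aware matching: only the note containing the standalone word "ego" matches.
print([n for n in notes if word_match("ego", n)])
```

Under these assumptions, the word-aware variant also handles the case raised above where a desired note contains both "ego" and "category", since it tests for the standalone word instead of excluding another word.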