nfdi4plants / Swate

Excel Add-In for annotation of experimental data and computational workflows.
https://swate-alpha.nfdi4plants.org
MIT License

[BUG] Improve search #95

Closed: Brilator closed 3 years ago

Brilator commented 3 years ago

Describe the bug: I'm sometimes a bit confused by the search results.

To Reproduce:

a) In "Annotation building block selection"

  1. Search for the treatment "red light"
  2. See the color "light red" as first result.

b) In "Advanced Search"

  1. Search by "Term name keywords": "red light exposure"
  2. The exact match "red light exposure" PECO:0007207 is the 9th hit

c) In "Advanced Search"

  1. Search by "Name must contain": "red light exposure"
  2. The exact match "red light exposure" PECO:0007207 is the 5th hit

Expected behavior: The search results should be displayed in the order: exact match > same order of keywords (see (a)) > match as many keywords as possible > match any
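To make the requested ordering concrete, here is a minimal Python sketch of such a tiered sort. It is only an illustration of the expected behavior, not how Swate ranks results; the function names `rank_key` and `rank` and the whitespace-based keyword splitting are assumptions made for this example.

```python
def rank_key(query: str, name: str) -> tuple[int, int]:
    """Sort key for one candidate term name; smaller keys are listed first.

    Hypothetical helper: keywords are simply the whitespace-separated words.
    """
    q = query.lower().split()
    n = name.lower().split()
    matched = sum(word in n for word in q)  # how many query keywords occur in the name

    if q == n:
        tier = 0  # exact match
    elif any(n[i:i + len(q)] == q for i in range(len(n) - len(q) + 1)):
        tier = 1  # all keywords present, adjacent and in the same order
    else:
        tier = 2  # fall back to the number of matched keywords
    # Within a tier, more matched keywords sort earlier (hence the minus sign).
    return (tier, -matched)


def rank(query: str, names: list[str]) -> list[str]:
    """Drop names that match no keyword, then order the rest by rank_key."""
    hits = [name for name in names if rank_key(query, name)[1] < 0]
    return sorted(hits, key=lambda name: rank_key(query, name))


print(rank("red light", ["light red", "blue light", "red light exposure", "green"]))
```

With these rules, "red light exposure" is listed before "light red" for the query "red light", because it contains the keywords in the same order.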

Freymaurer commented 3 years ago

Maybe I'll start by giving a bit more insight into how we order our search results:

We use a variant of a search algorithm called "Sørensen-Dice", which compares small sub-elements of the two text strings we want to compare. The more sub-elements the two strings share relative to their combined length, the better the score.

This is why "red light" matches "light red" slightly better than "red light exposure".
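As a rough illustration, here is a minimal Python sketch of the textbook Sørensen-Dice coefficient on character bigrams. It is not the Swate implementation: the choice of bigrams as sub-elements, the lowercasing, and the function names `bigrams` and `dice` are assumptions, and the variant we use may score slightly differently.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Count the overlapping two-character pieces of the lowercased text."""
    s = text.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice(a: str, b: str) -> float:
    """Sørensen-Dice: 2 * shared bigrams / (bigrams in a + bigrams in b)."""
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0  # both strings are too short to contain a bigram
    shared = sum((x & y).values())  # multiset intersection
    return 2 * shared / total


print(dice("red light", "light red"))           # 6 shared of 8 + 8 bigrams: 0.75
print(dice("red light", "red light exposure"))  # 8 shared of 8 + 17 bigrams: 0.64
```

The longer name shares more bigrams with the query, but its extra length in the denominator pulls the score below that of "light red".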

But as you suggested we will consider tweaking this search a bit to increase the score for "exact matches".
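One possible shape for such a tweak, sketched in Python: add a fixed bonus on top of the similarity score when the name equals the query or contains it as a whole phrase. Nothing here is decided; the function name `boosted_score` and the bonus values are purely hypothetical, and `base_score` stands for whatever similarity score the search already produces.

```python
def boosted_score(query: str, name: str, base_score: float,
                  exact_bonus: float = 0.5, phrase_bonus: float = 0.25) -> float:
    """Raise a similarity score for exact and in-order phrase matches.

    Hypothetical sketch: the bonus values are illustrative, not tuned.
    """
    # Normalize case and whitespace before comparing.
    q = " ".join(query.lower().split())
    n = " ".join(name.lower().split())
    if not q:
        return base_score
    if q == n:
        return base_score + exact_bonus
    # Padding with spaces makes this a whole-word phrase check.
    if f" {q} " in f" {n} ":
        return base_score + phrase_bonus
    return base_score


# Example scores for the query "red light" (bigram-based, for illustration).
print(boosted_score("red light", "light red", 0.75))
print(boosted_score("red light", "red light exposure", 0.64))
```

With these example values, "red light exposure" would end up above "light red", while names that merely share letters with the query keep their original score.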

Your criticism of the advanced term search is absolutely justified: I noticed that the Sørensen-Dice algorithm was not applied to this search. I added it and tested it with your example, and now "red light exposure" is hit number 1.

While we will discuss tweaking the Sørensen-Dice scoring, the change for the advanced term search will go live in version 0.2.1.

Brilator commented 3 years ago

Thanks. And my bad - I thought there was also an exact match "red light" (not "red light exposure").

Freymaurer commented 3 years ago

No worries! Just to make sure, I checked the database and did in fact not find a term with the name "red light". Did you expect such a term?

Brilator commented 3 years ago

No no. I just thought I saw it earlier and was confused about "light red" > "red light".

Freymaurer commented 3 years ago

@Brilator By the way, did you know that you can still reply on closed issues? I always close issues once I consider them solved, and I am afraid you'll think I am just shutting you up. You can always reply to a closed issue if you don't feel it has been solved satisfactorily.

Brilator commented 3 years ago

Sure, thanks. I did just that earlier this morning on closed issues (reacting with a thumbs-up rather than commenting).