Searching for partial words fails #72

darktable-org / dtdocs

darktable user manual

GNU General Public License v3.0

Closed: elstoc closed this issue 3 years ago

elstoc commented 3 years ago

If I search for "demo" on the search page, it says "Found 9 matches:" but does not display them. If I type a bit more ("demosa") the matches suddenly appear.

paperdigits commented 3 years ago

Not sure why that is... I'll have to investigate.

paperdigits commented 3 years ago

@elstoc I think I've fixed this. Let me know if it meets your expectation now.

elstoc commented 3 years ago

Still a bit odd I'm afraid. Searching for "dem" brings up a list that doesn't include demosaic. Searching for "demo" and "demos" shows nothing and then searching for "demosa" brings a list that matches demosaic. What's the intended behaviour here?

elstoc commented 3 years ago

Searching for "focus peaking" doesn't include the documentation for the "focus peaking" module in the results. While typing, the focus peaking module pops in and out of the results...

foc, focu, focus = in
focus p = out
focus pe, focus pea = in
focus peak = out
focus peaki, focus peakin = in
focus peaking = out

elstoc commented 3 years ago

I'm pretty sure this is caused by the fuzzy search functionality. If I change line 83 of themes/hugo-bootstrap-bare/assets/js/app.js from search(fuzzyQuery) to search(query) most of the issues (ok some of them) seem to go away.
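For illustration, one possible middle ground between search(fuzzyQuery) and search(query) is to send each term as separate exact, prefix, and fuzzy clauses with descending boosts. The sketch below only builds the query string, using lunr's ^ (boost), * (wildcard) and ~ (edit distance) query syntax. The function name buildQuery, the boost values, and the minimum term length for fuzzy matching are all assumptions made for this example; this is not what app.js currently does.

```javascript
// Hypothetical helper, not taken from app.js.
// Builds a lunr query string in which exact matches should outrank
// prefix matches, and prefix matches should outrank fuzzy ones.
// Note: special characters in the input (":", "*", "~", "^") are not escaped.
function buildQuery(input, minFuzzyLength = 5) {
  return input
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => {
      // Exact clause with the highest boost, then a trailing-wildcard clause.
      const clauses = [`${term}^10`, `${term}*^5`];
      // Only allow one edit on longer terms, to limit false matches.
      if (term.length >= minFuzzyLength) {
        clauses.push(`${term}~1`);
      }
      return clauses.join(" ");
    })
    .join(" ");
}

console.log(buildQuery("base curve"));
// base^10 base*^5 curve^10 curve*^5 curve~1
```

The resulting string would be passed to search() in place of the current query. Whether this actually improves the ranking would need to be checked against the real index.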

While we're there, the Submit button just seems to reset the page - could it be removed?

elstoc commented 3 years ago

I've looked at how search matches are scored in lunr.js and it seems very difficult to tune it properly.

The wildcard search fails when an exact term is found (so "demosaic*" doesn't match "demosaic").

The fuzzy matching finds too many false matches, leading to entirely unreasonable rankings (so searching for "base curve" gives the "culling" page twice the score of the "base curve" page, presumably because it's finding lots of fuzzy matches on that page).

It fails (I think) to give a higher score to pages where both terms match, again perhaps because of fuzzy matching.
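To make the false-match problem concrete, here is a minimal sketch of edit-distance matching, which is the idea behind a fuzzy clause such as term~1. The vocabulary is invented for illustration and is not taken from the dtdocs index, and lunr's internal implementation differs from this pairwise comparison.

```javascript
// Classic Levenshtein distance: the minimum number of single-character
// insertions, deletions, or substitutions needed to turn a into b.
function editDistance(a, b) {
  // prev[j] is the distance between the processed prefix of a and b.slice(0, j).
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a character from a
        curr[j - 1] + 1,    // insert a character into a
        prev[j - 1] + cost  // substitute (or keep) a character
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return every vocabulary word within maxEdits of the search term.
function fuzzyMatches(term, vocabulary, maxEdits = 1) {
  return vocabulary.filter((word) => editDistance(term, word) <= maxEdits);
}

// Invented vocabulary, for illustration only.
const vocabulary = ["base", "case", "based", "bias", "curve", "curved", "carve", "culling"];

console.log(fuzzyMatches("base", vocabulary));  // matches: base, case, based
console.log(fuzzyMatches("curve", vocabulary)); // matches: curve, curved, carve
```

Even with a single allowed edit, each search term picks up unrelated words such as "case" and "carve". If every such match adds to a page's score, a page containing many near-miss words could plausibly outrank the page that contains the exact terms.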

The only reasonable solution I can think of is to build our index more intelligently.

matt-maguire commented 3 years ago

Sounds like just a simple matter of coding up a Bayesian filter and training it to avoid these unreasonable matches...

elstoc commented 3 years ago

I think the currently-implemented search is acceptable for 3.4 and once we've migrated dtorg to hugo we should reconsider whether to stay with lunrjs or perhaps move the whole site to something else and/or change our indexing strategy.

paperdigits commented 3 years ago

Yes, that sounds good.

On December 1, 2020 3:54:56 AM PST, Chris Elston notifications@github.com wrote:

I think the currently-implemented search is acceptable for 3.4 and once we've migrated dtorg to hugo we should reconsider whether to stay with lunrjs or perhaps move the whole site to something else and/or change our indexing strategy.
