Closed akvadrako closed 2 years ago
Hi @akvadrako, I have integrated stemmers in custom applications using MiniSearch in the past. I do not recall which specific one, but I can look it up.
That said, I want to share a word of caution: stemmers can help increase recall, but they often make results confusing for the user, so I found myself relying more and more on fuzzy match instead. Here are a few reasons why:
- If `looking` gets stemmed to `look`, searching for `look` would give equal relevance (all else being equal) to documents containing the term `look` or `looking`. Fuzzy search would instead assign more relevance to the exact match.
- With prefix search, searching for `lo`, `loo`, `look` would find documents containing `looking`, but searching for `looki`, `lookin` would find no result, then `looking` would again find results (because it gets correctly stemmed).
- With fuzzy match, the `match` field in the results contains the actual terms in the document, which is not the case with stemming.

Of course there are still legitimate use cases for stemming, but I felt it was useful to share my personal experience. I ended up replacing stemming with fuzzy match in more than a few applications in recent years, with better user experience.
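
To make the prefix-search gap concrete, here is a minimal, self-contained sketch. It does not use MiniSearch itself: `stem`, `indexedTerms`, and `prefixSearch` are hypothetical names, and the stemmer is a deliberately naive stand-in (it only strips a trailing `ing`), so treat it as an illustration of the behavior rather than how any real stemmer or search engine is implemented.

```javascript
// Toy stemmer, for illustration only: strips a trailing "ing".
// A real stemmer (for example a Porter implementation) is far more elaborate.
const stem = (term) => term.replace(/ing$/, '')

// Terms are stored in stemmed form, as they would be when a stemmer
// is applied at indexing time.
const indexedTerms = ['looking', 'book'].map(stem) // ['look', 'book']

// Prefix search: stem the query, then keep index terms that start with it.
const prefixSearch = (query) => {
  const stemmedQuery = stem(query)
  return indexedTerms.filter((term) => term.startsWith(stemmedQuery))
}

// Simulate a user typing "looking" one character at a time.
const queries = ['lo', 'loo', 'look', 'looki', 'lookin', 'looking']
const hits = Object.fromEntries(
  queries.map((query) => [query, prefixSearch(query).length > 0])
)

console.log(hits)
// { lo: true, loo: true, look: true, looki: false, lookin: false, looking: true }
```

The results disappear at `looki` and `lookin` because those partial queries are neither prefixes of the stored stem `look` nor stemmed to it, and they come back only once the full word `looking` is typed.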
Closing the issue for now, but feel free to comment further if necessary.
I'm looking at using minisearch in place of lunr, but one requirement I have is language support, specifically for English and Dutch.
I see there are a few npm packages with stemmers, stop words, etc. Has anyone had success integrating them with minisearch?