internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Add support for solr spell checking ("Did you mean?") #6923

Open bicolino34 opened 2 years ago

bicolino34 commented 2 years ago

Currently, search is too strict about spelling. Compare the queries Безпека життєдіяльності and Безпека життєдіяльност: with just one letter missing (і), there are no results. This could possibly be solved with Solr's spell checking feature: https://solr.apache.org/guide/8_11/spell-checking.html
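For illustration, a minimal sketch of what a spell-check-enabled request could look like. The parameter names (`spellcheck`, `spellcheck.q`, `spellcheck.count`, `spellcheck.collate`) come from the Solr spell checking documentation; the helper name `build_spellcheck_params` is hypothetical, and the sketch assumes the target request handler already has the SpellCheck component configured in `solrconfig.xml`, which may not be the case for Open Library today.

```python
from urllib.parse import urlencode


def build_spellcheck_params(user_query: str, max_suggestions: int = 5) -> dict:
    """Build Solr query parameters that request "Did you mean?" suggestions.

    Hypothetical helper: assumes a SpellCheck component is attached to the
    request handler that receives these parameters.
    """
    return {
        "q": user_query,
        "spellcheck": "true",                      # turn the component on for this request
        "spellcheck.q": user_query,                # text to check, kept separate from q
        "spellcheck.count": str(max_suggestions),  # max suggestions per term
        "spellcheck.collate": "true",              # ask for a rewritten whole query
    }


params = build_spellcheck_params("Безпека життєдіяльност")
query_string = urlencode(params)  # ready to append to a /select-style URL
```

With collation on, Solr can return a corrected version of the whole query, which is the piece a "Did you mean?" link would be built from.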

Stakeholders

@cdrini

kushalShukla-web commented 2 years ago

Hey, can I work on this?

tfmorris commented 2 years ago

While the "Did you mean" spellcheck feature may be valuable in certain cases, I think for the stated use case using fuzzy search with an edit distance would be a better solution.

https://solr.apache.org/guide/8_11/the-standard-query-parser.html#fuzzy-searches

And, I'd consider this more a bug fix than a feature request. All modern search implementations handle this case with ease.
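As a rough sketch of the fuzzy search approach: the standard query parser accepts `term~N`, where `N` is the maximum edit distance (0, 1, or 2). The helper below is hypothetical and not taken from the Open Library codebase; it assumes the user query can be split on whitespace and that each term is matched fuzzily, which is only one of several possible strategies.

```python
import re

# Characters that have special meaning in Solr's standard query parser.
_SPECIAL_CHARS = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def fuzzy_query(user_query: str, max_edits: int = 1) -> str:
    """Append a fuzzy operator (~N) to every whitespace-separated term.

    Hypothetical helper for illustration only.
    """
    if max_edits not in (0, 1, 2):
        # Solr's fuzzy operator supports an edit distance of at most 2.
        raise ValueError("max_edits must be 0, 1, or 2")
    fuzzy_terms = []
    for term in user_query.split():
        # Escape query syntax so user input is treated as a literal term.
        escaped = _SPECIAL_CHARS.sub(r"\\\1", term)
        fuzzy_terms.append(f"{escaped}~{max_edits}")
    return " ".join(fuzzy_terms)


print(fuzzy_query("Безпека життєдіяльност"))
# prints: Безпека~1 життєдіяльност~1
```

An edit distance of 1 is enough to cover the missing-letter case from the report. Very short terms may match far too much under fuzzy matching, so a real implementation would probably want a minimum term length or a fallback that only goes fuzzy when the exact query returns nothing.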

benbdeitch commented 6 months ago

Hello! I've worked on a few other Solr issues, and I'd love to give this one a shot, if you'd assign me to the task. Is the plan to go for the 'fuzzy search' approach, to leverage Solr's spell-checker, or both?

cdrini commented 6 months ago

There's a bit of research/experimenting here, I reckon. Thinking/experimenting with strategies for both might be the way to go!

As a note, here are the updated docs links:

It might also be worth looking at some books to see if there are any strategies for this. I've had success with the O'Reilly books on Solr in the past. Note that my public library gives me free access to the O'Reilly catalogue, so maybe check if that's the case for you!