Open asfimport opened 12 years ago
Erik Hatcher (@erikhatcher) (migrated from JIRA)
This patch to DirectSpellChecker does the trick (using accuracy=0.8 or less in the description example):
- FuzzyTermsEnum e = new FuzzyTermsEnum(terms, atts, term, editDistance, Math.max(minPrefix, editDistance-1), true);
+ FuzzyTermsEnum e = new FuzzyTermsEnum(terms, atts, term, editDistance, minPrefix, true);
In a conversation with Robert Muir, we agreed that the default should instead stay as it is (restricting to minPrefix=1 when editDistance=2), but that the restriction should be made optional so that a minPrefix of 0 can be used.
Robert Muir (@rmuir) (migrated from JIRA)
yeah i think we should add an option to disable this heuristic.
It was basically a perf/relevance thing (in general, edits of 2, esp. considering a transposition is a single edit, along with a minPrefix of 0 can yield surprisingly irrelevant stuff).
But if someone wants that... let them do it.
Sascha Szott (@saschaszott) (migrated from JIRA)
Should we at least add a short note to the reference guide that explains the effect of setting minPrefix effectively to 1 (even if the user set it to 0) in case no suggestions with an edit distance of 1 are available in the term dictionary?
DirectSpellChecker currently mandates a minPrefix of 1 when editDistance=2. This prohibits a query of "nusglasses" from matching the indexed "sunglasses" term.
Granted, there can be performance issues with using a minPrefix of 0, but it's a risk that a user should be allowed to take if needed.
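To make the effect concrete, here is a minimal, Lucene-free sketch of the heuristic. Only the expression Math.max(minPrefix, editDistance - 1) comes from the patch above. The class name, the helper methods, and the simple optimal-string-alignment distance are assumptions added for illustration; they are not DirectSpellChecker's actual code, which relies on FuzzyTermsEnum.

```java
public class MinPrefixHeuristicDemo {

    // Mirrors the expression in the unpatched line:
    // Math.max(minPrefix, editDistance - 1).
    static int effectiveMinPrefix(int minPrefix, int editDistance) {
        return Math.max(minPrefix, editDistance - 1);
    }

    // Hypothetical helper: true if both strings share their first
    // prefixLength characters (a required, non-fuzzy prefix).
    static boolean sharesPrefix(String query, String candidate, int prefixLength) {
        if (query.length() < prefixLength || candidate.length() < prefixLength) {
            return false;
        }
        return query.regionMatches(0, candidate, 0, prefixLength);
    }

    // Illustrative optimal-string-alignment distance in which an adjacent
    // transposition counts as one edit. This is a stand-in, not Lucene's
    // automaton-based matching.
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                // Adjacent transposition, e.g. "ab" <-> "ba".
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        String query = "nusglasses";
        String indexed = "sunglasses";
        int userMinPrefix = 0;
        int distance = editDistance(query, indexed);
        int effective = effectiveMinPrefix(userMinPrefix, distance);

        System.out.println("edit distance: " + distance);
        System.out.println("effective minPrefix: " + effective);
        System.out.println("candidate allowed with heuristic: "
                + sharesPrefix(query, indexed, effective));
        System.out.println("candidate allowed with minPrefix as configured: "
                + sharesPrefix(query, indexed, userMinPrefix));
    }
}
```

The two terms differ in their first character, so any required prefix of length 1 rules out "sunglasses" as a candidate even though it is within two edits of the query.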
Migrated from LUCENE-4500 by Erik Hatcher (@erikhatcher), updated Feb 28 2019