languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

Correct spelling of the word is not showing up in the suggestion box #922

Open patchling opened 6 years ago

patchling commented 6 years ago

The correct spelling of the word is not showing up in the suggestion box. Maybe grammar or definition problems?

procedure - the word that I was looking for; I got this by entering the word into the DuckDuckGo search bar.

procuder - the letters that I used.

This is the sentence that I used it in: If a person wants something from me the procuder is simple: Approach me and make an offer.

With a little copy & paste this sentence does just fine. If a person wants something from me the procedure is simple: Approach me and make an offer.

Oh, and please keep up the good work with LT, I need it so very much.

danielnaber commented 6 years ago

These words are too far apart, you can check that here: http://www.convertforfree.com/levenshtein-distance-calculator/ - the distance is 3, but to keep LT fast, we stop searching at distance 2 once we have found words.
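For readers who want to check the distance without the online calculator, here is a minimal sketch of the classic dynamic-programming Levenshtein distance. The class name `LevenshteinSketch` and its method are made up for illustration; this is not LanguageTool's code, which relies on a dictionary-based speller rather than pairwise comparison.

```java
// Illustrative sketch only; not LanguageTool's implementation.
class LevenshteinSketch {

    // Edit distance where insertion, deletion and substitution each cost 1.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building the first j chars of b from "" takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // reducing the first i chars of a to "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert or delete
                        prev[j - 1] + cost);                    // match or substitute
            }
            // The row just computed becomes the "previous" row for the next iteration.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("procuder", "procedure")); // prints 3
    }
}
```

One way to reach 3 by hand: both words share the prefix "proc", and turning "uder" into "edure" takes two substitutions (u to e, e to u) plus one insertion (the final e).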

dpelle commented 6 years ago

@danielnaber wrote:

These words are too far apart, you can check that here: http://www.convertforfree.com/levenshtein-distance-calculator/ - the distance is 3, but to keep LT fast, we stop searching at distance 2 once we have found words.

Does the Levenshtein distance threshold depend on the word length? A threshold of 2 might be too large for a short word of, say, fewer than 4 letters. But a distance of 2 or even 3 may be quite acceptable for a long word of, say, more than 9 letters. Intuitively, I would use something like this:

word    Levenshtein
length  threshold
======  ===========
<=4     1
5       2
6       2
7       2
8       2
>=9     3
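The table above could be expressed as a small lookup, sketched here in Java. The class and method names are hypothetical, and the cut-offs are simply the ones from the table; this is a proposal from this thread, not how LanguageTool currently chooses its threshold.

```java
// Hypothetical helper mirroring the proposed table; not part of LanguageTool.
class LengthBasedThreshold {

    // Returns the maximum edit distance to search for a word of the given length.
    static int maxDistance(int wordLength) {
        if (wordLength <= 4) {
            return 1; // short words: allow a single edit only
        }
        if (wordLength <= 8) {
            return 2; // lengths 5 to 8
        }
        return 3;     // 9 letters or more
    }

    public static void main(String[] args) {
        System.out.println(maxDistance("procuder".length())); // prints 2
    }
}
```

Note that the misspelling from this issue, "procuder", has 8 letters, so if the threshold is taken from the misspelled word's length, these particular cut-offs would still stop at distance 2.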

Although it would have to be more complex, since errors involving combinations of letters close to each other should count as smaller penalties than other kinds of errors. For example:

danielnaber commented 6 years ago

Does the Levenshtein distance threshold depend on the word length?

Yes, in a simple way:

https://github.com/languagetool-org/languagetool/blob/master/languagetool-core/src/main/java/org/languagetool/rules/spelling/morfologik/MorfologikSpellerRule.java#L185