Open fwollatz opened 4 years ago
Currently I limit the Levenshtein-Distance to 2 for all languages, including German. The reason is that for longer words (say 10 characters or longer) getting past 2 creates a very long list of possibles to check and the system slows down. It is feasible to allow larger Levenshtein-Distances and I would appreciate any help or pull requests that make that possible.
I think it's possibly necessary to shift from creating all candidates to O(n^2) loop on the vocabulary with some efficient variant (Mbleven, for example) of edit distance. Then the complexity will be (~) a function that depends more on the vocabulary size rather than the edit distance.
Though, from experience - I have to say it does not scale well (but better than creating all candidates, at least for long words).
It is not possible to correct Words, that have a higher Levenshtein-Distance than 2. (At least in German).
A parameter to change this would be much appreciated.