wolfgarbe / SymSpell

SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
https://seekstorm.com/blog/1000x-spelling-correction/
MIT License

Issue with word segmentation #52

Closed: kumar9536 closed this issue 5 years ago

kumar9536 commented 5 years ago

https://github.com/wolfgarbe/SymSpell/blob/8b71ac07dd64e1c9a70c12ab21c83dec6f904b80/SymSpell/SymSpell.cs#L963

I have a question about how the word segmentation code is able to segment the following example:

Input: thequickbrownfoxjumpsoverthelazydog
Output: the quick brown fox jumps over the lazy dog

In the outer loop, "j" iterates from 0 to "input.length", and in the inner loop, "i" varies from 1 to "imax". Assuming maxSegmentationWordLength is large enough, imax always takes the value (input.length - j), so as j increases, imax decreases and the range of the substring we take, i.e. "part", shrinks. So, my concerns are

Please assist here. Thanks a lot.

wolfgarbe commented 5 years ago

Have a look at the graphical illustration of the algorithm: Algorithm
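
As a rough illustration of the general idea, here is a minimal Python sketch of segmentation by dynamic programming. It is not the SymSpell.cs code. It assumes a tiny hypothetical frequency dictionary (`WORD_COUNTS`), does no spelling correction on the parts, and keeps one result per end position in a plain list. The point it tries to show is why a shrinking `imax` is harmless: each `part` starting at `j` only has to extend the best segmentation of the prefix that ends at `j`, which has already been computed and stored.

```python
import math

# Hypothetical toy frequency dictionary (an assumption for this sketch,
# not SymSpell's dictionary format or data).
WORD_COUNTS = {
    "the": 500, "quick": 50, "brown": 40, "fox": 30, "jumps": 20,
    "over": 100, "lazy": 25, "dog": 60,
}
TOTAL = sum(WORD_COUNTS.values())
MAX_WORD_LENGTH = max(len(word) for word in WORD_COUNTS)


def segment(text, max_word_length=MAX_WORD_LENGTH):
    """Split text into dictionary words, maximizing the summed log probability.

    Returns the segmented string, or None if no full segmentation exists.
    """
    n = len(text)
    # best[k] holds (score, words) for the best segmentation of text[:k],
    # or None if text[:k] cannot be segmented with the dictionary.
    best = [None] * (n + 1)
    best[0] = (0.0, [])

    for j in range(n):                      # j: start index of the next part
        if best[j] is None:                 # no segmentation ends at j
            continue
        imax = min(n - j, max_word_length)  # shrinks as j grows
        for i in range(1, imax + 1):        # i: length of the part
            part = text[j:j + i]
            count = WORD_COUNTS.get(part)
            if count is None:
                continue
            # Extend the stored best result for text[:j] by this part.
            score = best[j][0] + math.log(count / TOTAL)
            if best[j + i] is None or score > best[j + i][0]:
                best[j + i] = (score, best[j][1] + [part])

    return None if best[n] is None else " ".join(best[n][1])


print(segment("thequickbrownfoxjumpsoverthelazydog"))
# the quick brown fox jumps over the lazy dog
```

By the time `j` reaches 25, for example, `best[25]` already holds "the quick brown fox jumps over", so the inner loop only needs to look at the remaining 10 characters.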

kumar9536 commented 5 years ago

Thanks a lot for such a detailed explanation.