wolfgarbe / SymSpell

SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
https://seekstorm.com/blog/1000x-spelling-correction/
MIT License

Support for different Encoding Style while loading Dictionary from a text file #37

Open RAJAT--PALIWAL opened 6 years ago

RAJAT--PALIWAL commented 6 years ago

Currently there is no support for other encodings, such as ISO-8859-1, when reading a text file to create the dictionary. Having the encoding as an optional argument would also help load accented characters like å.

wolfgarbe commented 6 years ago

SymSpell uses UTF-8, and UTF-8 supports accented characters like å.

If your text files are in a different encoding, you could always convert them to UTF-8 before loading them into SymSpell: https://www.motobit.com/util/charset-codepage-conversion.asp https://linux.die.net/man/1/iconv
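As a rough illustration of that conversion step, here is a minimal Python sketch. It is not part of SymSpell; the function name `convert_to_utf8` and its parameters are hypothetical, and it assumes you already know the source encoding of the file (ISO-8859-1 is used as the default only because it is the example from this issue).

```python
def convert_to_utf8(src_path, dst_path, source_encoding="iso-8859-1"):
    """Re-encode a text file to UTF-8 (hypothetical helper, not a SymSpell API).

    source_encoding must match the real encoding of src_path; a wrong value
    either raises UnicodeDecodeError or silently yields wrong characters.
    """
    # newline="" disables newline translation, so line endings are copied as-is.
    with open(src_path, "r", encoding=source_encoding, newline="") as fin, \
         open(dst_path, "w", encoding="utf-8", newline="") as fout:
        # Stream line by line so large dictionary files need not fit in memory.
        for line in fin:
            fout.write(line)
```

Calling `convert_to_utf8("dictionary_latin1.txt", "dictionary_utf8.txt")` (both file names are placeholders) would produce a UTF-8 copy that can then be loaded as usual.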

Nevertheless, I can add an optional Encoding/Codepage parameter to LoadDictionary and CreateDictionary in the future.