hiddentao / fast-levenshtein

Efficient JavaScript implementation of the Levenshtein algorithm with locale-specific collator support.
MIT License

Chinese bad recognition #14

Closed. Adavo closed this issue 8 years ago

Adavo commented 8 years ago

console.log(window.Levenshtein.get('你好世界', '你好'));
=> I get 6, but that is not what I should get, is it? If 6 is correct, then I don't understand how the distance is calculated for Simplified Chinese characters.

hiddentao commented 8 years ago

I just ran this locally and got 2.

Adavo commented 8 years ago

Found the cause: if the file is encoded as UTF-8 without a BOM, the result is 6; with a BOM, it is 2. I hadn't paid attention to that.
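
A likely explanation for the 6, sketched below under assumptions: without a BOM the browser may have decoded the script as a single-byte encoding, so each 3-byte UTF-8 character became 3 separate characters (12 vs. 6 characters instead of 4 vs. 2). The `levenshtein` and `misdecodeAsLatin1` helpers are hypothetical and written only for this illustration; they are not the fast-levenshtein implementation, and Latin-1 is an assumed stand-in for whatever encoding the browser actually picked.

```javascript
// Plain dynamic-programming Levenshtein distance over UTF-16 code units.
// Illustrative helper only, not the fast-levenshtein code.
function levenshtein(a, b) {
  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Simulate reading UTF-8 bytes as a single-byte encoding (assumed: Latin-1),
// so that every byte turns into one character.
function misdecodeAsLatin1(s) {
  const bytes = new TextEncoder().encode(s);
  return Array.from(bytes, (byte) => String.fromCharCode(byte)).join('');
}

const correct = levenshtein('你好世界', '你好');
const garbled = levenshtein(
  misdecodeAsLatin1('你好世界'),
  misdecodeAsLatin1('你好')
);
console.log(correct, garbled); // prints "2 6"
```

If this is what happened, declaring the encoding explicitly (for example with `<meta charset="utf-8">` on the page) should give 2 regardless of whether the file has a BOM.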

hiddentao commented 8 years ago

Thanks. Pushed v2 with a slight performance update.