kentcdodds / match-sorter

Simple, expected, and deterministic best-match sorting of an array in JavaScript
https://npm.im/match-sorter
MIT License

Default match broken for Cyrillic/Russian because of incorrect diacritics removal #57

Closed: glebtv closed this issue 5 years ago

glebtv commented 5 years ago

I opened an issue with the diacritics library (Diacritics.js), but that project has not been updated in 4 years, so I am opening an issue here too: https://github.com/nijikokun/Diacritics.js/issues/6

Relevant code or config

console.log(diacritics.clean("л"))

л / Л is the lowercase/uppercase Cyrillic (Russian) letter L and should not be replaced with Latin n when matching.

This breaks case-insensitive search in match-sorter with default settings for any Cyrillic text that contains this letter.

node-diacritics has the same problem: https://github.com/andrewrk/node-diacritics/issues/32

remove-accents does not: https://github.com/tyxla/remove-accents

Example word which does not match: "Лед"

What you did:

const list = ["Лед", "лед"];
const filtered = matchSorter(list, "л");

What happened:

Only ["лед"] is returned; both items should match.
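
For comparison, here is a minimal sketch of the expected behaviour. It is not match-sorter's algorithm and does not use any of the libraries mentioned above. `stripLatinAccents`, `fold`, and `filterByPrefix` are hypothetical helpers written for this illustration. The assumption is that accent stripping should only touch marks attached to basic Latin letters, so Cyrillic text passes through unchanged before the case-insensitive comparison.

```javascript
// Hypothetical helper (not part of match-sorter or any diacritics package):
// strip combining marks only when they follow a basic Latin letter.
function stripLatinAccents(text) {
  return text
    .normalize('NFD') // split letters from their combining marks
    .replace(/([A-Za-z])[\u0300-\u036f]+/g, '$1') // drop marks after Latin letters only
    .normalize('NFC') // recompose everything that was left untouched
}

// Fold a string for case-insensitive, accent-insensitive comparison.
function fold(text) {
  return stripLatinAccents(text).toLowerCase()
}

// Naive prefix filter standing in for a real matcher.
function filterByPrefix(items, query) {
  const needle = fold(query)
  return items.filter(item => fold(item).startsWith(needle))
}

const list = ['Лед', 'лед', 'Café']
console.log(filterByPrefix(list, 'л')) // both Cyrillic entries are kept
console.log(filterByPrefix(list, 'cafe')) // 'Café' matches once the accent is stripped
```

Because the regular expression requires a Latin base letter, decomposed Cyrillic letters such as ё and й are recomposed by the final `normalize('NFC')` call rather than being altered.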

Reproduction repository:

https://codesandbox.io/s/r76w8oqo8o

Problem description:

Suggested solution:

kentcdodds commented 5 years ago

Hi @glebtv.

Thanks for the report. If you have a better diacritics library, please make a PR to swap with the new one and I'll give it a look.

Also, you can probably use your bundler's configuration to alias the diacritics module to a custom one. That would be even better than "allow user to bring his own diacritics library", because it would also let you avoid including the diacritics package in your bundle.

I'll look forward to a PR :)

kentcdodds commented 5 years ago

:tada: This issue has been resolved in version 3.0.0 :tada:

Your semantic-release bot :package::rocket: