Closed · jove4015 closed this issue 3 weeks ago
@farzher Is it possible that accents could be added again but as an option?
I've been using this for a while just off of my PR branch and when your comment came in, I realized that it was a little outdated, so I just rebased this off of v3.0.2. I'm not sure anyone will ever acknowledge the PR but just in case it's helpful to you.
it is helpful, thanks a lot! hoping @farzher will take a look and maybe merge it.
@farzher Is there any chance you could merge it? It would be very helpful for all non-English users whose languages have accented or otherwise unusual letters.
@farzher Having read your comments on other people's issues, I'm pretty sure you're not merging my PR because you don't like it. If you could just tell me what you don't like about it, what you would like to see changed, any feedback at all, I would be happy to work on it. The fact that there are so many people commenting on this seems to indicate you are indeed missing a feature that people want.
@jove4015 i haven't had a usecase for this but you're right there's enough comments about it that i'm finally looking into merging this now
this causes an infinite loop
fuzzysort.go('w', ['Rich JavaScript Applications – the Seven Frameworks (Throne of JS, 2012) - Steve Sanderson’s blog - As seen on YouTube™'], {normalizeDiacritics:true})
also, this highlights incorrectly
fuzzysort.go('ジ', ['ファイナルファンタジーXIV スターターパック'])[0].highlight()
// 'ファイナルファンタ<b>ジー</b>XIV スターターパック'
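The misplaced highlight is most likely an index-shift problem: NFD decomposition changes the length of kana that carry a voiced-sound mark, so match indices computed against the normalized string drift relative to the original. A quick check in plain Node (no fuzzysort required):

```javascript
// 'ジ' (U+30B8) decomposes under NFD into 'シ' (U+30B7) + U+3099
// (combining voiced sound mark), so the normalized string is longer
// than the original and highlight offsets no longer line up.
const original = 'ジ';
const decomposed = original.normalize('NFD');

console.log(original.length);   // 1
console.log(decomposed.length); // 2
```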
i'm thinking of using this to remove accents so it doesn't screw with japanese characters, but i'm not sure if this causes other problems
var remove_accents = (str) => str.replace(/\p{Script=Latin}+/gu, match => match.normalize('NFD')).replace(/[\u0300-\u036f]/g, '')
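As a sanity check on that idea: because the one-liner only NFD-normalizes Latin-script runs before stripping combining marks, Japanese text passes through untouched. A quick demonstration (my own test snippet, not part of the PR):

```javascript
// Decompose only Latin-script runs, then strip combining diacritical
// marks (U+0300–U+036F). Non-Latin text is never decomposed, so
// katakana voiced-sound marks are unaffected.
var remove_accents = (str) =>
  str.replace(/\p{Script=Latin}+/gu, match => match.normalize('NFD'))
     .replace(/[\u0300-\u036f]/g, '');

console.log(remove_accents('Über blog'));              // 'Uber blog'
console.log(remove_accents('ファイナルファンタジーXIV')); // unchanged
```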
We just integrated your fuzzy sort into the search function in our grid component and it is working great: our search accuracy and performance have both improved substantially.
Along the way, we were asked to include functionality to normalize accents and diacritics:
- ü should match u (i.e., "uber" should match the string "Über")
- é should match e
- Å should match a
- ﬁ should match fi
And so on. We found that the best way to introduce this was to modify this library, and so I'm submitting it now to be included.
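For reference, all four of the examples above fold correctly under NFKD normalization (plain NFD will not expand the ﬁ ligature, since that is a compatibility mapping). Which form the PR actually uses is its own decision; this is just the general idea:

```javascript
// NFKD decomposes accented letters into base letter + combining mark
// and also expands compatibility characters like the 'ﬁ' ligature;
// stripping the combining marks (U+0300–U+036F) then leaves plain ASCII.
const fold = (s) =>
  s.normalize('NFKD').replace(/[\u0300-\u036f]/g, '').toLowerCase();

console.log(fold('Über')); // 'uber'
console.log(fold('é'));    // 'e'
console.log(fold('Å'));    // 'a'
console.log(fold('ﬁ'));    // 'fi'
```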
Normalization is triggered by a configuration option "normalizeDiacritics" which defaults to false - so any existing implementation should continue to work exactly as before. If this new option is specified, additional parsing occurs in the prepareLowerInfo function. Every time this option is changed, the built-in cache is cleared so that unnormalized results don't contaminate normalized searches and vice versa.
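The cache-invalidation behavior described above could be sketched roughly like this. This is an illustration of the idea only; names like `preparedCache` and `prepare` are placeholders, not fuzzysort's actual internals:

```javascript
// Sketch: prepared strings are cached, and flipping normalizeDiacritics
// clears the cache so normalized and unnormalized entries never mix.
const preparedCache = new Map(); // stand-in for the library's built-in cache
let currentNormalize = false;

function prepare(str, normalizeDiacritics) {
  if (normalizeDiacritics !== currentNormalize) {
    preparedCache.clear(); // option changed: drop stale entries
    currentNormalize = normalizeDiacritics;
  }
  let prepared = preparedCache.get(str);
  if (prepared === undefined) {
    const lower = str.toLowerCase();
    prepared = normalizeDiacritics
      ? lower.normalize('NFD').replace(/[\u0300-\u036f]/g, '')
      : lower;
    preparedCache.set(str, prepared);
  }
  return prepared;
}

console.log(prepare('Über', true));  // 'uber'
console.log(prepare('Über', false)); // 'über' — cache was cleared first
```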
I didn't spend a lot of time on the performance impacts of this - obviously normalizing all the strings comes with a cost, but for our use case this is pretty negligible. I hope that by making this optional and not default, any performance impact is minimized to just those who want to use the functionality.
Thanks for your consideration! Please let me know if there's anything you'd like for me to update.