beyondgrep / ack2

**ack 2 is no longer being maintained. ack 3 is the latest version.**
https://github.com/beyondgrep/ack3/

feature: unicode normalize #658

Closed rurban closed 4 years ago

rurban commented 6 years ago

ack could be the only file-grep tool that supports Unicode normalization. Even GNU grep with multibyte support does not find "Café" in a file containing "Café" when the first uses the decomposed form "e\x{301}" (e followed by a combining acute accent) and the second uses the composed form "\xe9" (small e with acute). See e.g. http://perl11.org/blog/foldcase.html
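To make the mismatch concrete, here is a minimal sketch in Python (not ack's Perl code) using only the standard `unicodedata` module. The names `composed`, `decomposed`, and `nfc` are illustrative. It shows that the two spellings differ as code point sequences and only compare equal after normalization.

```python
import unicodedata

# The two spellings of "Café" described above.
composed = "Caf\u00e9"      # é as the single code point U+00E9
decomposed = "Cafe\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

def nfc(text):
    """Return text in Unicode Normalization Form C (composed)."""
    return unicodedata.normalize("NFC", text)

# A plain substring search compares code points, so it misses the match.
raw_match = composed in decomposed

# After normalizing both sides to the same form, the search succeeds.
normalized_match = nfc(composed) in nfc(decomposed)

print(len(composed), len(decomposed))   # the sequences differ in length
print(raw_match, normalized_match)
```

Normalizing both the pattern and the searched text to the same form (NFC or NFD) is what makes the comparison succeed; which form is used matters less than using it on both sides.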

This could be done by using fc with -i, and by changing $str = "(?i)$str" to $str = "(?iu)$str" on Perl versions that support case folding with /u and Unicode patterns; see perldoc perlre for /u. /l could also be supported for the current locale's encoding.
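As a rough illustration of combining case folding with normalization, the following Python sketch (again, not ack's implementation) folds both the pattern and the line before a literal substring search. Python's `str.casefold()` plays the role that `fc` plays in Perl. The helpers `fold_key` and `line_matches` are hypothetical names, and the approach is a simplified variant of Unicode canonical caseless matching, limited to literal strings rather than regular expressions.

```python
import unicodedata

def fold_key(text):
    """Build a comparison key: decompose, case-fold, then recompose.

    Case folding can change which normalization form a string is in,
    so the text is normalized again after folding.
    """
    decomposed = unicodedata.normalize("NFD", text)
    folded = decomposed.casefold()
    return unicodedata.normalize("NFC", folded)

def line_matches(needle, line):
    """Literal, case-insensitive, normalization-insensitive substring test."""
    return fold_key(needle) in fold_key(line)

# Composed uppercase pattern against a decomposed lowercase line.
print(line_matches("CAF\u00c9", "un cafe\u0301 noir"))
# Full case folding maps U+00DF (sharp s) to "ss".
print(line_matches("strasse", "Stra\u00dfe"))
```

Recomposing to NFC at the end is a design choice for substring search: with a decomposed key, an unaccented pattern such as "cafe" would also match the base letters of "café".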