jeremija / unipicker

Search unicode characters in console and copy to clipboard
MIT License
75 stars 11 forks

Output depends on `$LANG` env variable #12

Open javalsai opened 1 month ago

javalsai commented 1 month ago

Took me a LONG while to debug this: it works properly in a terminal but not from a Hyprland binding...

Basically, unipicker outputs the chosen character properly if the `$LANG` env variable is (apparently) set to an installed locale. If it isn't (empty, C...), it outputs an unknown sequence, which appears to be only the first byte of the character's encoded sequence.

image (`rg` is just like `grep`)

javalsai commented 1 month ago

Apparently, it's something about sed?

image

javalsai commented 1 month ago

My guess is that sed without `LANG` treats all text as plain byte sequences, whereas with an "advanced" (UTF-8) `LANG` it can parse UTF-8 sequences and include the whole character in the match group instead of just one byte.
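
To illustrate that guess, here is a minimal Python sketch of the difference between byte-oriented and character-oriented matching. It is only an analogy: it does not call sed or unipicker, the sample character and variable names are made up for the example, and Python's `re` on `bytes` vs `str` merely stands in for how a regex engine might behave under a C/empty locale vs a UTF-8 one.

```python
import re

# Hypothetical sample input: U+2192 (rightwards arrow), three bytes in UTF-8.
char = "\u2192"
raw = char.encode("utf-8")  # b'\xe2\x86\x92'

# Byte-oriented matching (stand-in for a C/empty locale): "." matches one byte.
byte_match = re.match(rb"(.)", raw).group(1)

# Character-oriented matching (stand-in for a UTF-8 locale): "." matches one code point.
char_match = re.match(r"(.)", char).group(1)

# A lone lead byte is not valid UTF-8, so decoding it yields the replacement character.
shown = byte_match.decode("utf-8", errors="replace")

print(byte_match)          # b'\xe2'
print(ascii(char_match))   # '\u2192'
print(shown == "\ufffd")   # True
```

If the guess is right, the byte-oriented case matches what the Hyprland binding shows: the capture group holds only the lead byte, which a terminal renders as an unknown glyph.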