clangen / musikcube

a cross-platform, terminal-based music player, audio engine, metadata indexer, and server in c++
https://musikcube.com
BSD 3-Clause "New" or "Revised" License

Diacritics break alphabetical order #636

Open Travisyard opened 9 months ago

Travisyard commented 9 months ago

Tested in Musikcube 3.0.2 (x86_64) on Fedora and in Musikdroid 3.0.2.

Expected behavior: Characters with accent marks and other diacritics should be alphabetized as if they did not have diacritics.

Observed behavior: In the album artists listing in Musikcube and Musikdroid, characters with diacritics are not alphabetized correctly.

Example: "Télépopmusik" should be between "Tautumeitas" and "The Irrepressibles", but it shows up after "True Faith". [image]

Here is how Fedora's file manager handles it, which is widely accepted as the correct way to alphabetize such characters: [image]

I observed this in the album artists listing, but it is possible that it affects other lists within the app as well.

RokerHRO commented 2 months ago

I guess the sorting is done purely by byte value: é is 0xC3 0xA9 in UTF-8, so it sorts after z (0x7A). Proper "human understandable" sorting is much more complicated and also locale-dependent!
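To illustrate the guess above, here is a minimal C++ sketch of what byte-wise ordering does to the names from the report. It is not taken from the musikcube source; `sortByBytes` and `reproduceIssue` are hypothetical names used only for this example, and the accented name is written with explicit UTF-8 escapes so the result does not depend on the source file's encoding.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: sorts names with std::string's operator<, which
// compares the underlying bytes as unsigned values (no locale, no Unicode).
std::vector<std::string> sortByBytes(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    return names;
}

// Reproduces the ordering described in the report, assuming byte-wise sorting.
void reproduceIssue() {
    // "Télépopmusik" with é spelled out as its UTF-8 bytes 0xC3 0xA9.
    const std::string telepopmusik = "T\xC3\xA9l\xC3\xA9popmusik";

    const std::vector<std::string> sorted = sortByBytes(
        {"True Faith", telepopmusik, "The Irrepressibles", "Tautumeitas"});

    // The lead byte 0xC3 is greater than every ASCII letter, so the
    // accented name ends up last, after "True Faith".
    assert(sorted[0] == "Tautumeitas");
    assert(sorted[1] == "The Irrepressibles");
    assert(sorted[2] == "True Faith");
    assert(sorted[3] == telepopmusik);
}
```

The second byte of each name decides the order here: `a` (0x61), `h` (0x68) and `r` (0x72) are all below 0xC3, which matches the observed listing.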

There are libraries that implement this, e.g. ICU. See https://unicode-org.github.io/icu/userguide/collation/ for more information on the topic, which is not as easy as it seems.
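For comparison, the ordering requested in the report can be approximated without any library by stripping diacritics before comparing. The sketch below is only an illustration of that expected order, not ICU collation and not a proposed fix: `foldKey`, `sortFolded` and `checkExpectedOrder` are hypothetical names, and the folding table covers only a handful of Latin letters in the range U+00C0 to U+00FF. Anything outside that range, and any locale-specific rule, is left untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds a comparison key by lowercasing ASCII and
// replacing a few two-byte UTF-8 sequences (lead byte 0xC3) with an
// unaccented ASCII letter. The table is deliberately tiny and incomplete.
std::string foldKey(const std::string& s) {
    std::string key;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c == 0xC3 && i + 1 < s.size()) {
            unsigned char d = static_cast<unsigned char>(s[i + 1]);
            // Map uppercase U+00C0..U+00DE onto lowercase (skip the sign at 0x97).
            if (d >= 0x80 && d <= 0x9E && d != 0x97) d += 0x20;
            char base = 0;
            if (d >= 0xA0 && d <= 0xA5) base = 'a';       // a with grave .. ring
            else if (d == 0xA7) base = 'c';               // c with cedilla
            else if (d >= 0xA8 && d <= 0xAB) base = 'e';  // e with grave .. diaeresis
            else if (d >= 0xAC && d <= 0xAF) base = 'i';  // i with grave .. diaeresis
            else if (d == 0xB1) base = 'n';               // n with tilde
            else if (d >= 0xB2 && d <= 0xB6) base = 'o';  // o with grave .. diaeresis
            else if (d >= 0xB9 && d <= 0xBC) base = 'u';  // u with grave .. diaeresis
            if (base != 0) {
                key += base;
                ++i;  // consume the continuation byte as well
                continue;
            }
        }
        key += static_cast<char>(std::tolower(c));
    }
    return key;
}

// Sorts names by their folded keys instead of their raw bytes.
std::vector<std::string> sortFolded(std::vector<std::string> names) {
    std::sort(names.begin(), names.end(),
              [](const std::string& a, const std::string& b) {
                  return foldKey(a) < foldKey(b);
              });
    return names;
}

// Checks the placement the report asks for.
void checkExpectedOrder() {
    const std::string telepopmusik = "T\xC3\xA9l\xC3\xA9popmusik";
    const std::vector<std::string> sorted = sortFolded(
        {"True Faith", telepopmusik, "The Irrepressibles", "Tautumeitas"});
    assert(sorted[0] == "Tautumeitas");
    assert(sorted[1] == telepopmusik);
    assert(sorted[2] == "The Irrepressibles");
    assert(sorted[3] == "True Faith");
}
```

A hand-written table like this breaks down quickly (combining marks, non-Latin scripts, languages that sort accented letters separately), which is why a collation library such as ICU is the more realistic option.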