antoinepirlot / Satunes

Modern MP3 Player to listen to your local music files on Android Lollipop 5.1.1+.
GNU General Public License v3.0

UTF-8 (and '...') handling #603

Open braoult opened 1 month ago

braoult commented 1 month ago

After starting Satunes and having a look at the Albums tab, the following gets displayed:

[Screenshot of the Albums tab]

The left faulty album name should show Ainsi Soit Je... (with 3 trailing dots, therefore only ASCII here). The right faulty one should show いぶき (UTF-8).

antoinepirlot commented 1 month ago

Hey,

Thanks for reporting.

Hmm, I thought it was already fixed 🤔.

I will check that, thanks

braoult commented 1 month ago

Thanks a lot. I forgot: I use Android 14.

antoinepirlot commented 1 month ago

> Thanks a lot. I forgot: I use Android 14.

Thanks

braoult commented 1 month ago

If needed, I can send you some of the faulty music files, but that would not be possible here, due to copyright issues...

antoinepirlot commented 1 month ago

> If needed, I can send you some of the faulty music files, but that would not be possible here, due to copyright issues...

You can send them to me by email if you prefer, at pirlot.antoine@outlook.com

antoinepirlot commented 1 month ago

Issue reproduced for the ellipsis, but not for いぶき.

I also noticed that it does not happen every time. Setting the album name to "Ainsi Soit Je..." with the ellipsis character shows wrong characters instead of "...", but "いぶき ..." (still with the ellipsis character) does not.

It's due to how the characters in the file information are formatted, either by the OS, by the program used to edit the names, or by Jetpack Compose.

Also, I'll look later into making the app accept different formats to avoid this issue.

antoinepirlot commented 5 days ago

I don't have enough knowledge about that at this time

braoult commented 5 days ago

Usually, to fix UTF-related issues, one can first dump the data to understand exactly what the characters are.
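As an illustration of that kind of dump, here is a minimal sketch in Python. It is not the tool used for this report; the function name `dump_chars` is made up for the example, and it assumes the tag value has already been read into a string.

```python
import unicodedata


def dump_chars(text):
    """Return (code point, UTF-8 bytes in hex, Unicode name) for each character."""
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"
        utf8_hex = ch.encode("utf-8").hex(" ")  # bytes.hex(sep) needs Python 3.8+
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((code_point, utf8_hex, name))
    return rows


# Compare an ASCII full stop with the single ellipsis character.
for row in dump_chars(".\u2026"):
    print(*row, sep="  ")
```

Three ASCII dots show up as three `U+002E` rows, while the single ellipsis character shows up as one `U+2026` row, which makes the two cases easy to tell apart.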

For the "Ainsi Soit Je…" case, I just double-checked and dumped the data (filenames and MP3 tags) from the sample I shared with you.

My mistake: some of the ellipses in this case are not ASCII dots, but the Unicode HORIZONTAL ELLIPSIS character (U+2026).

This album MP3 tag should be decoded (perhaps unnecessarily) and displayed as standard UTF-8. How do you process it in this case to end up with a wrong display?

Note: U+2026 is encoded in UTF-8 as the bytes e2 80 a6 (hex).
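To make that byte sequence concrete, the short Python sketch below encodes U+2026 and then decodes the same three bytes with the wrong charset. This only illustrates one common way such a character ends up garbled; whether this is what happens inside Satunes is an assumption, not something established in this thread. The helper name `decode_as` is made up for the example.

```python
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS

# The three bytes mentioned in the note: e2 80 a6.
utf8_bytes = ELLIPSIS.encode("utf-8")


def decode_as(raw, charset):
    """Decode raw bytes with the given charset, replacing undecodable bytes."""
    return raw.decode(charset, errors="replace")


correct = decode_as(utf8_bytes, "utf-8")      # the ellipsis itself
as_cp1252 = decode_as(utf8_bytes, "cp1252")   # three unrelated characters
as_latin1 = decode_as(utf8_bytes, "latin-1")  # includes an invisible control character

print(utf8_bytes.hex(" "))
print(repr(correct), repr(as_cp1252), repr(as_latin1))
```

Reading UTF-8 bytes as Windows-1252 or ISO-8859-1 turns one character into three, so a display with three odd characters in place of the ellipsis would be consistent with a charset mismatch.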

EDIT: It may be more complicated than I thought: a 2015 discussion in the Mp3tag community seems to indicate that some choices have to be made, which may not work everywhere. Maybe you should avoid spending time on this issue until you have more information on how the tags should be encoded. It may even differ depending on the ID3v2 version :-(
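For context on the version dependency: in ID3v2, the body of a text frame starts with one byte that names the encoding of the text that follows. Values 0 (ISO-8859-1) and 1 (UTF-16 with a byte order mark) exist in ID3v2.3, and ID3v2.4 adds 2 (UTF-16BE without a byte order mark) and 3 (UTF-8). The Python sketch below decodes a frame body on that basis. It is not how Satunes reads tags; `decode_text_frame_body` is a hypothetical helper, and it assumes the frame body has already been extracted, with no unsynchronisation or compression applied.

```python
# Encoding byte of an ID3v2 text frame mapped to a Python codec name.
ID3V2_TEXT_ENCODINGS = {
    0: "latin-1",    # ISO-8859-1
    1: "utf-16",     # UTF-16 with byte order mark
    2: "utf-16-be",  # UTF-16BE without byte order mark (ID3v2.4)
    3: "utf-8",      # UTF-8 (ID3v2.4)
}


def decode_text_frame_body(body):
    """Decode the body of an ID3v2 text frame (encoding byte + text)."""
    if not body:
        return ""
    codec = ID3V2_TEXT_ENCODINGS.get(body[0])
    if codec is None:
        raise ValueError(f"unknown ID3v2 text encoding byte: {body[0]:#04x}")
    text = body[1:].decode(codec, errors="replace")
    # Text may carry a trailing terminator; drop it.
    return text.rstrip("\x00")


# Example: a UTF-8 frame body (encoding byte 3) holding the album name.
body = b"\x03" + "Ainsi Soit Je\u2026".encode("utf-8")
print(decode_text_frame_body(body))
```

Taggers do not always honour the declared encoding (for example, writing Windows-1252 bytes under encoding byte 0), which may be the kind of choice the linked discussion refers to; this sketch trusts the declared byte and does not try to detect such cases.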

antoinepirlot commented 5 days ago

Yeah, I spent a lot of time looking for a solution that doesn't make Satunes take longer to load.

I looked into ID3v2, but I didn't find a way to handle the encoding type without a huge performance impact.

Also, I'm starting a master's degree in computer science, so I hope I will find a solution once I have more knowledge 🤭.

I'm still checking for a solution.

Thanks for the link