yumeyao / android_platform_frameworks_av

Fork of AOSP that addresses two issues in *Android Media Service*: 1. speed; 2. text encoding that follows the standards, to resolve mojibake in tags.
https://android.googlesource.com/platform/frameworks/av/

Review the behavior of handling text-encodings of tags #2

Open yumeyao opened 9 years ago

yumeyao commented 9 years ago

The issue has also been reported in AOSP issue tracker as: https://code.google.com/p/android/issues/detail?id=81428

While I discuss (argue) with the AOSP community about which solution is right, I'll implement the one that strictly follows the standards here. This ticket tracks the implementation details (such as what the standard says for each format).

I have created a wiki page that is open to everyone for editing. It also serves as a checklist. https://github.com/yumeyao/android_platform_frameworks_av/wiki/Text-encoding-of-tags-in-formats-reference-list

To help with the research of different formats, you can contribute by leaving a comment here and/or editing the wiki page above.

yumeyao commented 9 years ago

Regardless of the various format specs, another issue is how to store the tags: converting them to UTF-8 before storing, or storing them as they are, with an additional UTF-8 BOM to mark UTF-8 text.

One problem in the latter case is the album title. If we have one MP3 and one M4A with the same album title but in different text encodings (ANSI for the MP3 and UTF-8 for the M4A), the UTF-8 BOM makes the stored titles differ, so two entries for the same album appear, which may confuse the user.

So until major surgery is done on the media server (which might change the database logic), converting to UTF-8 seems to be a good solution for now.
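
To illustrate the trade-off, here is a minimal Python sketch. It is not the media server code: `store_as_is`, `store_as_utf8` and `compare` are hypothetical helpers, and "ANSI" is assumed to mean Windows-1252 for the sake of the example.

```python
UTF8_BOM = b"\xef\xbb\xbf"

def store_as_is(raw: bytes, is_utf8: bool) -> bytes:
    """Hypothetical 'store as it is' option: keep the bytes, prefix a BOM to mark UTF-8."""
    return UTF8_BOM + raw if is_utf8 else raw

def store_as_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Hypothetical 'convert first' option: decode with the tag's encoding, store UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")

def compare(album: str, ansi: str = "cp1252"):
    """Return (equal when stored as-is, equal when converted) for one album title."""
    mp3_raw = album.encode(ansi)      # title as found in the MP3 (assumed ANSI code page)
    m4a_raw = album.encode("utf-8")   # same title as found in the M4A
    as_is_equal = store_as_is(mp3_raw, False) == store_as_is(m4a_raw, True)
    converted_equal = store_as_utf8(mp3_raw, ansi) == store_as_utf8(m4a_raw, "utf-8")
    return as_is_equal, converted_equal

for title in ("Greatest Hits", "Café"):
    print(title, compare(title))
```

Even a pure ASCII title ends up as two different byte strings when the BOM marker is stored with it, while the converted values compare equal.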

DervishD commented 8 years ago

As I specified in my comment, this bug is biting me all the time. If you are interested I can prepare for you a file that gets its metadata misinterpreted no matter the encoding used, both using ID3 and Vorbiscomment.

Thanks A LOT for taking the time to report and investigate this issue.

Meanwhile, there's an app called ID3Fixer which can be used to fix the Media Scanner database. It's far from perfect: it shows ads, the user interface is a bit poor, AND fixing a large database doesn't work, but for exceptional cases it can be used.

Also, forcing some files to have ID3v2.3/UTF-16 or ID3v2.4/UTF-8 works for most of my collection, but I have to tweak things on a per-song basis. Cumbersome and idiotic, because the tags are properly handled in any other system I've tested them on.

If you want the test file, let me know.
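
For reference on the ID3v2.3/UTF-16 and ID3v2.4/UTF-8 combinations mentioned above: an ID3v2 text frame starts with one byte that declares its encoding. Below is a minimal Python sketch of a strict decoder that trusts that byte. `decode_text_frame` is a hypothetical name, and this is only an illustration of reading the declared encoding, not what the Android scanner actually does.

```python
# Encoding byte values defined for ID3v2 text frames.
# 0x02 and 0x03 are only defined in ID3v2.4.
ID3V2_ENCODINGS = {
    0x00: "latin-1",    # ISO-8859-1
    0x01: "utf-16",     # UTF-16 with BOM
    0x02: "utf-16-be",  # UTF-16BE without BOM (v2.4 only)
    0x03: "utf-8",      # UTF-8 (v2.4 only)
}

def decode_text_frame(body: bytes, version: int = 4) -> str:
    """Decode a text frame body (encoding byte + text) using its declared encoding."""
    if not body:
        return ""
    marker = body[0]
    if marker not in ID3V2_ENCODINGS or (version < 4 and marker > 0x01):
        raise ValueError(f"invalid encoding byte {marker:#04x} for ID3v2.{version}")
    text = body[1:].decode(ID3V2_ENCODINGS[marker])
    return text.rstrip("\x00")  # drop an optional trailing terminator

print(decode_text_frame(b"\x01" + "Café".encode("utf-16"), version=3))
print(decode_text_frame(b"\x03" + "Café".encode("utf-8"), version=4))
```

A tag written as ID3v2.3 can only declare ISO-8859-1 or UTF-16, which is why UTF-8 text needs ID3v2.4 to be declared correctly.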