beetbox / beets

music library manager and MusicBrainz tagger
http://beets.io/
MIT License

Songs copied to Android device have odd ID3 tags? #1893

Closed jackwilsdon closed 3 years ago

jackwilsdon commented 8 years ago

Problem

I'm not entirely sure whether this is a beets bug or something completely different. Some songs that contain a right single quotation mark (’) in the title have it rendered as â€™ by pretty much any Android music player. I tested Google Play Music, an ID3 editor and Pulsar Music Player, and all of them exhibit the issue.

I also have an album that contains a degree sign in its title, which is rendered as B° instead of °.

Here are some screenshots of the issue (from Google Play Music):

And the output from beets list (unicode characters are printed correctly):

Caravan Palace - <|°_°|> - Lone Digger
Caravan Palace - <|°_°|> - Comics
Caravan Palace - <|°_°|> - Aftermath
<continued>
OK Go - Hungry Ghosts - The Writing’s on the Wall
<continued>
Two Door Cinema Club - Tourist History - Eat That Up, It’s Good for You

It seems like it's something to do with how Android's music API (I assume that's what these apps are using) handles unicode in ID3 tags? I'm not sure whether or not this is a bug in the Android API itself or how beets is storing the ID3 tags (possibly related to #1885?).

I am using beet convert -d Music and copying the converted files to my device, however I don't believe this is related (the issue is happening with both flac files (which are converted) and mp3 files (which are not)).

Setup

My configuration (output of beet config) is:

directory: ~/Music/Music
convert:
    max_bitrate: 320
    never_convert_lossy_files: yes
    copy_album_art: no
    format: mp3
    formats:
        mp3:
            command: ffmpeg -i $source -ab 320k -map_metadata 0 $dest
            extension: mp3
        alac:
            command: ffmpeg -i $source -y -vn -acodec alac $dest
            extension: m4a
        aac:
            command: ffmpeg -i $source -y -vn -acodec libfaac -aq 100 $dest
            extension: m4a
        opus: ffmpeg -i $source -y -vn -acodec libopus -ab 96k $dest
        flac: ffmpeg -i $source -y -vn -acodec flac $dest
        ogg: ffmpeg -i $source -y -vn -acodec libvorbis -aq 2 $dest
        wma: ffmpeg -i $source -y -vn -acodec wmav2 -vn $dest
    dest:
    auto: no
    threads: 8
    tmpdir:

    paths: {}
    pretend: no
    quiet: no
    embed: yes
fetchart:
    minWidth: 500
    maxWidth: 1024
    enforce_ratio: yes
    minwidth: 0
    sources:
    - coverart
    - itunes
    - amazon
    - albumart
    cautious: no
    maxwidth: 0
    auto: yes
    cover_names:
    - cover
    - front
    - art
    - album
    - folder
    remote_priority: no
embedart:
    maxwidth: 1024
    remove_art_file: yes
    ifempty: yes
    compare_threshold: 0
    auto: yes

plugins: info convert fetchart embedart missing lastgenre
lastgenre:
    count: 1
    source: album
    force: yes
    min_weight: 10
    auto: yes
    whitelist: yes
    separator: ', '
    fallback:
    canonical: no
missing:
    count: no
    total: no

jackwilsdon commented 8 years ago

So after running exiftool -v3 -l 09\ Eat\ That\ Up\,\ It’s\ Good\ for\ You.mp3, I can see that the file itself does contain the correct tags (as expected, as cmus and other music players on OS X see the name fine).

Here is a royalty-free mp3 with the tags from one of the songs applied to it (using beet import) that exhibits the issue: Kevin MacLeod - Pixelland

Here is the output from exiftool (note: it doesn't seem to be able to handle unicode tags, however the hex value shows that the right single quotation mark is e2 80 99 which is correct):

  ExifToolVersion = 10.08
  FileName = 09 Eat That Up, It...s Good for You.mp3
  Directory = .
  FileSize = 7735810
  FileModifyDate = 1456326549
  FileAccessDate = 1456326907
  FileInodeChangeDate = 1456326549
  FilePermissions = 33188
  FileType = MP3
  FileTypeExtension = MP3
  MIMEType = audio/mpeg
  MPEGAudioVersion = 3
  AudioLayer = 1
  AudioBitrate = 13
  SampleRate = 0
  ChannelMode = 0
  MSStereo = 0
  MPEG_Audio_Bit26-27 = 0
  IntensityStereo = 0
  CopyrightFlag = 0
  OriginalMedia = 0
  Emphasis = 0
  ID3Size = 254340
ID3v2.4.0:
  + [ID3v2_4 directory, 254202 bytes]
  | Title = Eat That Up, It...s Good for You
  | - Tag 'TIT2' (34 bytes):
  |     0014: 03 45 61 74 20 54 68 61 74 20 55 70 2c 20 49 74 [.Eat That Up, It]
  |     0024: e2 80 99 73 20 47 6f 6f 64 20 66 6f 72 20 59 6f [...s Good for Yo]
  |     0034: 75 00                                           [u.]
  | Artist = Two Door Cinema Club
  | - Tag 'TPE1' (22 bytes):
  |     0040: 03 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |     0050: 20 43 6c 75 62 00                               [ Club.]
  | Track = 9/10
  | - Tag 'TRCK' (6 bytes):
  |     0060: 03 39 2f 31 30 00                               [.9/10.]
  | Album = Tourist History
  | - Tag 'TALB' (17 bytes):
  |     0070: 03 54 6f 75 72 69 73 74 20 48 69 73 74 6f 72 79 [.Tourist History]
  |     0080: 00                                              [.]
  | PartOfSet = 1/1
  | - Tag 'TPOS' (5 bytes):
  |     008b: 03 31 2f 31 00                                  [.1/1.]
  | RecordingTime = 2010-02-24
  | - Tag 'TDRC' (12 bytes):
  |     009a: 03 32 30 31 30 2d 30 32 2d 32 34 00             [.2010-02-24.]
  | Genre = Indie Rock
  | - Tag 'TCON' (12 bytes):
  |     00b0: 03 49 6e 64 69 65 20 52 6f 63 6b 00             [.Indie Rock.]
  | PictureMIMEType = image/jpeg
  | PictureType = 3
  | PictureDescription = 
  | Picture = .....JFIF......C.........................................................[snip]
  | - Tag 'APIC' (244621 bytes):
  |     00c6: 03 69 6d 61 67 65 2f 6a 70 65 67 00 03 00 ff d8 [.image/jpeg.....]
  |     00d6: ff e0 00 10 4a 46 49 46 00 01 01 00 00 01 00 01 [....JFIF........]
  |     00e6: 00 00 ff db 00 43 00 01 01 01 01 01 01 01 01 01 [.....C..........]
  |     00f6: 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 [................]
  |     0106: 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 [................]
  |     [snip 244541 bytes]
  | BeatsPerMinute = 0
  | - Tag 'TBPM' (3 bytes):
  |    3bc5d: 03 30 00                                        [.0.]
  | Compilation = 0
  | - Tag 'TCMP' (3 bytes):
  |    3bc6a: 03 30 00                                        [.0.]
  | OriginalReleaseTime = 2010-02-17
  | - Tag 'TDOR' (12 bytes):
  |    3bc77: 03 32 30 31 30 2d 30 32 2d 31 37 00             [.2010-02-17.]
  | Language = eng
  | - Tag 'TLAN' (5 bytes):
  |    3bc8d: 03 65 6e 67 00                                  [.eng.]
  | Media = Digital Media
  | - Tag 'TMED' (15 bytes):
  |    3bc9c: 03 44 69 67 69 74 61 6c 20 4d 65 64 69 61 00    [.Digital Media.]
  | Band = Two Door Cinema Club
  | - Tag 'TPE2' (22 bytes):
  |    3bcb5: 03 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |    3bcc5: 20 43 6c 75 62 00                               [ Club.]
  | Publisher = Kitsun..
  | - Tag 'TPUB' (10 bytes):
  |    3bcd5: 03 4b 69 74 73 75 6e c3 a9 00                   [.Kitsun...]
  | PerformerSortOrder = Two Door Cinema Club
  | - Tag 'TSOP' (22 bytes):
  |    3bce9: 03 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |    3bcf9: 20 43 6c 75 62 00                               [ Club.]
  | UserDefinedText = (ALBUMARTISTSORT) Two Door Cinema Club
  | - Tag 'TXXX' (38 bytes):
  |    3bd09: 03 41 4c 42 55 4d 41 52 54 49 53 54 53 4f 52 54 [.ALBUMARTISTSORT]
  |    3bd19: 00 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 [.Two Door Cinema]
  |    3bd29: 20 43 6c 75 62 00                               [ Club.]
  | UserDefinedText = (Album Artist Credit) Two Door Cinema Club
  | - Tag 'TXXX' (42 bytes):
  |    3bd39: 03 41 6c 62 75 6d 20 41 72 74 69 73 74 20 43 72 [.Album Artist Cr]
  |    3bd49: 65 64 69 74 00 54 77 6f 20 44 6f 6f 72 20 43 69 [edit.Two Door Ci]
  |    3bd59: 6e 65 6d 61 20 43 6c 75 62 00                   [nema Club.]
  | UserDefinedText = (Artist Credit) Two Door Cinema Club
  | - Tag 'TXXX' (36 bytes):
  |    3bd6d: 03 41 72 74 69 73 74 20 43 72 65 64 69 74 00 54 [.Artist Credit.T]
  |    3bd7d: 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 20 43 [wo Door Cinema C]
  |    3bd8d: 6c 75 62 00                                     [lub.]
  | UserDefinedText = (CATALOGNUMBER) 355028890
  | - Tag 'TXXX' (25 bytes):
  |    3bd9b: 03 43 41 54 41 4c 4f 47 4e 55 4d 42 45 52 00 33 [.CATALOGNUMBER.3]
  |    3bdab: 35 35 30 32 38 38 39 30 00                      [55028890.]
  | UserDefinedText = (MusicBrainz Album Artist Id) 6f1de078-6684-4792-820d-2ffad64c15ed
  | - Tag 'TXXX' (66 bytes):
  |    3bdbe: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3bdce: 75 6d 20 41 72 74 69 73 74 20 49 64 00 36 66 31 [um Artist Id.6f1]
  |    3bdde: 64 65 30 37 38 2d 36 36 38 34 2d 34 37 39 32 2d [de078-6684-4792-]
  |    3bdee: 38 32 30 64 2d 32 66 66 61 64 36 34 63 31 35 65 [820d-2ffad64c15e]
  |    3bdfe: 64 00                                           [d.]
  | UserDefinedText = (MusicBrainz Album Id) 54d5c88c-7a5b-4502-93ab-a6b245611a94
  | - Tag 'TXXX' (59 bytes):
  |    3be0a: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3be1a: 75 6d 20 49 64 00 35 34 64 35 63 38 38 63 2d 37 [um Id.54d5c88c-7]
  |    3be2a: 61 35 62 2d 34 35 30 32 2d 39 33 61 62 2d 61 36 [a5b-4502-93ab-a6]
  |    3be3a: 62 32 34 35 36 31 31 61 39 34 00                [b245611a94.]
  | UserDefinedText = (MusicBrainz Album Release Country) XE
  | - Tag 'TXXX' (38 bytes):
  |    3be4f: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3be5f: 75 6d 20 52 65 6c 65 61 73 65 20 43 6f 75 6e 74 [um Release Count]
  |    3be6f: 72 79 00 58 45 00                               [ry.XE.]
  | UserDefinedText = (MusicBrainz Album Status) Official
  | - Tag 'TXXX' (35 bytes):
  |    3be7f: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3be8f: 75 6d 20 53 74 61 74 75 73 00 4f 66 66 69 63 69 [um Status.Offici]
  |    3be9f: 61 6c 00                                        [al.]
  | UserDefinedText = (MusicBrainz Album Type) album
  | - Tag 'TXXX' (30 bytes):
  |    3beac: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    3bebc: 75 6d 20 54 79 70 65 00 61 6c 62 75 6d 00       [um Type.album.]
  | UserDefinedText = (MusicBrainz Artist Id) 6f1de078-6684-4792-820d-2ffad64c15ed
  | - Tag 'TXXX' (60 bytes):
  |    3bed4: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 72 74 [.MusicBrainz Art]
  |    3bee4: 69 73 74 20 49 64 00 36 66 31 64 65 30 37 38 2d [ist Id.6f1de078-]
  |    3bef4: 36 36 38 34 2d 34 37 39 32 2d 38 32 30 64 2d 32 [6684-4792-820d-2]
  |    3bf04: 66 66 61 64 36 34 63 31 35 65 64 00             [ffad64c15ed.]
  | UserDefinedText = (MusicBrainz Release Group Id) a3597f45-b9d9-4c8a-803e-0a7d0d4d4e9b
  | - Tag 'TXXX' (67 bytes):
  |    3bf1a: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 52 65 6c [.MusicBrainz Rel]
  |    3bf2a: 65 61 73 65 20 47 72 6f 75 70 20 49 64 00 61 33 [ease Group Id.a3]
  |    3bf3a: 35 39 37 66 34 35 2d 62 39 64 39 2d 34 63 38 61 [597f45-b9d9-4c8a]
  |    3bf4a: 2d 38 30 33 65 2d 30 61 37 64 30 64 34 64 34 65 [-803e-0a7d0d4d4e]
  |    3bf5a: 39 62 00                                        [9b.]
  | UserDefinedText = (Script) Latn
  | - Tag 'TXXX' (13 bytes):
  |    3bf67: 03 53 63 72 69 70 74 00 4c 61 74 6e 00          [.Script.Latn.]
  | ID3_UFID = http://musicbrainz.org5dabd513-6e5a-4d95-a534-35cb3a4cc976
  | - Tag 'UFID' (59 bytes):
  |    3bf7e: 68 74 74 70 3a 2f 2f 6d 75 73 69 63 62 72 61 69 [http://musicbrai]
  |    3bf8e: 6e 7a 2e 6f 72 67 00 35 64 61 62 64 35 31 33 2d [nz.org.5dabd513-]
  |    3bf9e: 36 65 35 61 2d 34 64 39 35 2d 61 35 33 34 2d 33 [6e5a-4d95-a534-3]
  |    3bfae: 35 63 62 33 61 34 63 63 39 37 36                [5cb3a4cc976]
  | Lyrics = 
  | - Tag 'USLT' (6 bytes):
  |    3bfc3: 03 00 00 00 00 00                               [......]
ID3v1:
  + [BinaryData directory, 128 bytes]
  | Title = Eat That Up, It?s Good for You
  | - Tag 0x0003 (30 bytes, string[30]):
  |   760985: 45 61 74 20 54 68 61 74 20 55 70 2c 20 49 74 3f [Eat That Up, It?]
  |   760995: 73 20 47 6f 6f 64 20 66 6f 72 20 59 6f 75       [s Good for You]
  | Artist = Two Door Cinema Club
  | - Tag 0x0021 (30 bytes, string[30]):
  |   7609a3: 54 77 6f 20 44 6f 6f 72 20 43 69 6e 65 6d 61 20 [Two Door Cinema ]
  |   7609b3: 43 6c 75 62 00 00 00 00 00 00 00 00 00 00       [Club..........]
  | Album = Tourist History
  | - Tag 0x003f (30 bytes, string[30]):
  |   7609c1: 54 6f 75 72 69 73 74 20 48 69 73 74 6f 72 79 00 [Tourist History.]
  |   7609d1: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Year = 2010
  | - Tag 0x005d (4 bytes, string[4]):
  |   7609df: 32 30 31 30                                     [2010]
  | Comment = 
  | - Tag 0x0061 (30 bytes, string[30]):
  |   7609e3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  |   7609f3: 00 00 00 00 00 00 00 00 00 00 00 00 00 09       [..............]
  | Track = 0 9
  | - Tag 0x007d (2 bytes, int8u[2]):
  |   7609ff: 00 09                                           [..]
  | Genre = 187
  | - Tag 0x007f (1 bytes, int8u[1]):
  |   760a01: bb                                              [.]
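
As a sanity check on the dump above, here is a minimal Python sketch that decodes the TIT2 payload by hand. It assumes the usual ID3v2.4 convention that the first payload byte selects the text encoding (0x03 meaning UTF-8). The `ID3V24_ENCODINGS` table and the variable names are illustrative only and do not come from beets or exiftool.

```python
# Payload of the TIT2 frame, copied from the exiftool hex dump above.
TIT2_PAYLOAD = bytes.fromhex(
    "03 45 61 74 20 54 68 61 74 20 55 70 2c 20 49 74 "
    "e2 80 99 73 20 47 6f 6f 64 20 66 6f 72 20 59 6f "
    "75 00"
)

# ID3v2.4 text-encoding byte -> Python codec name (illustrative table).
ID3V24_ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}

# The first byte declares the encoding; the rest is the text plus a NUL terminator.
encoding = ID3V24_ENCODINGS[TIT2_PAYLOAD[0]]
title = TIT2_PAYLOAD[1:].decode(encoding).rstrip("\x00")

print(encoding, repr(title))
```

If the declared encoding is honoured, the bytes e2 80 99 come back as a single right single quotation mark, which matches what cmus and the other desktop players show.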

sampsyo commented 8 years ago

Very strange! A couple of questions come to mind:

jackwilsdon commented 8 years ago

  1. So it looks like any songs using an ASCII quote (which is 27 in hex) render fine. From my initial findings, it seems that Android reads the ID3 tags as ASCII instead of Unicode, which leads to the issues I am experiencing.
  2. It also happens with FLAC files, I suspect because Android is generally handling all music metadata as ASCII.

I've noticed something else unusual; I have another song in my library with a unicode character in the title that renders fine on Android, which furthers my belief that there is an issue with the ID3 tag itself.

The unicode character in the other song is ❦ (a floral heart, U+2766), which renders perfectly on Android.

Here is an exiftool dump of the file that renders the unicode correctly:

  ExifToolVersion = 10.08
  FileName = 02 ... (Ripe & Ruin).mp3
  Directory = .
  FileSize = 2270216
  FileModifyDate = 1456067754
  FileAccessDate = 1456341320
  FileInodeChangeDate = 1456067754
  FilePermissions = 33188
  FileType = MP3
  FileTypeExtension = MP3
  MIMEType = audio/mpeg
  MPEGAudioVersion = 3
  AudioLayer = 1
  AudioBitrate = 3
  SampleRate = 0
  ChannelMode = 1
  MSStereo = 0
  MPEG_Audio_Bit26-27 = 0
  IntensityStereo = 0
  CopyrightFlag = 0
  OriginalMedia = 1
  Emphasis = 0
  VBRFrames = 2760
  VBRBytes = 2100579
  ID3Size = 169637
ID3v2.4.0:
  + [ID3v2_4 directory, 169499 bytes]
  | Title = ... (Ripe & Ruin)
  | - Tag 'TIT2' (19 bytes):
  |     0014: 03 e2 9d a6 20 28 52 69 70 65 20 26 20 52 75 69 [.... (Ripe & Rui]
  |     0024: 6e 29 00                                        [n).]
  | Artist = alt-J
  | - Tag 'TPE1' (7 bytes):
  |     0031: 03 61 6c 74 2d 4a 00                            [.alt-J.]
  | Track = 2/13
  | - Tag 'TRCK' (6 bytes):
  |     0042: 03 32 2f 31 33 00                               [.2/13.]
  | Album = An Awesome Wave
  | - Tag 'TALB' (17 bytes):
  |     0052: 03 41 6e 20 41 77 65 73 6f 6d 65 20 57 61 76 65 [.An Awesome Wave]
  |     0062: 00                                              [.]
  | PartOfSet = 1/1
  | - Tag 'TPOS' (5 bytes):
  |     006d: 03 31 2f 31 00                                  [.1/1.]
  | RecordingTime = 2012
  | - Tag 'TDRC' (6 bytes):
  |     007c: 03 32 30 31 32 00                               [.2012.]
  | Genre = Electronic
  | - Tag 'TCON' (12 bytes):
  |     008c: 03 45 6c 65 63 74 72 6f 6e 69 63 00             [.Electronic.]
  | PictureMIMEType = image/jpeg
  | PictureType = 3
  | PictureDescription = 
  | Picture = .....JFIF...HH..C........................................................[snip]
  | - Tag 'APIC' (98167 bytes):
  |     00a2: 03 69 6d 61 67 65 2f 6a 70 65 67 00 03 00 ff d8 [.image/jpeg.....]
  |     00b2: ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 00 48 [....JFIF.....H.H]
  |     00c2: 00 00 ff db 00 43 00 03 02 02 03 02 02 03 03 03 [.....C..........]
  |     00d2: 03 04 03 03 04 05 08 05 05 04 04 05 0a 07 07 06 [................]
  |     00e2: 08 0c 0a 0c 0c 0b 0a 0b 0b 0d 0e 12 10 0d 0e 11 [................]
  |     [snip 98087 bytes]
  | Private (SubDirectory) -->
  | + [PRIV directory, 69368 bytes]
  | | TRAKTOR4 = DMRT....RDH 0.SKHC....DOMF.....NSRV..ATAD....BDNA....@WTRAq..}} 086/WW[snip]
  | | - Tag 'TRAKTOR4' (69359 bytes):
  | |     0009: 44 4d 52 54 e3 0c 01 00 02 00 00 00 52 44 48 20 [DMRT........RDH ]
  | |     0019: 30 00 00 00 03 00 00 00 53 4b 48 43 04 00 00 00 [0.......SKHC....]
  | |     0029: 00 00 00 00 e2 c2 8c 00 44 4f 4d 46 04 00 00 00 [........DOMF....]
  | |     0039: 00 00 00 00 1a 0b df 07 4e 53 52 56 04 00 00 00 [........NSRV....]
  | |     0049: 00 00 00 00 07 00 00 00 41 54 41 44 9b 0c 01 00 [........ATAD....]
  | |     [snip 69279 bytes]
  | BeatsPerMinute = 89
  | - Tag 'TBPM' (4 bytes):
  |    28f25: 03 38 39 00                                     [.89.]
  | Compilation = 0
  | - Tag 'TCMP' (3 bytes):
  |    28f33: 03 30 00                                        [.0.]
  | OriginalReleaseTime = 2012-05-28
  | - Tag 'TDOR' (12 bytes):
  |    28f40: 03 32 30 31 32 2d 30 35 2d 32 38 00             [.2012-05-28.]
  | InitialKey = 2m
  | - Tag 'TKEY' (4 bytes):
  |    28f56: 03 32 6d 00                                     [.2m.]
  | Language = eng
  | - Tag 'TLAN' (5 bytes):
  |    28f64: 03 65 6e 67 00                                  [.eng.]
  | Media = CD
  | - Tag 'TMED' (4 bytes):
  |    28f73: 03 43 44 00                                     [.CD.]
  | Band = alt-J
  | - Tag 'TPE2' (7 bytes):
  |    28f81: 03 61 6c 74 2d 4a 00                            [.alt-J.]
  | Publisher = Liberator Music
  | - Tag 'TPUB' (17 bytes):
  |    28f92: 03 4c 69 62 65 72 61 74 6f 72 20 4d 75 73 69 63 [.Liberator Music]
  |    28fa2: 00                                              [.]
  | PerformerSortOrder = alt-J
  | - Tag 'TSOP' (7 bytes):
  |    28fad: 03 61 6c 74 2d 4a 00                            [.alt-J.]
  | UserDefinedText = (ALBUMARTISTSORT) alt-J
  | - Tag 'TXXX' (23 bytes):
  |    28fbe: 03 41 4c 42 55 4d 41 52 54 49 53 54 53 4f 52 54 [.ALBUMARTISTSORT]
  |    28fce: 00 61 6c 74 2d 4a 00                            [.alt-J.]
  | UserDefinedText = (Album Artist Credit) alt-J
  | - Tag 'TXXX' (27 bytes):
  |    28fdf: 03 41 6c 62 75 6d 20 41 72 74 69 73 74 20 43 72 [.Album Artist Cr]
  |    28fef: 65 64 69 74 00 61 6c 74 2d 4a 00                [edit.alt-J.]
  | UserDefinedText = (Artist Credit) alt-J
  | - Tag 'TXXX' (21 bytes):
  |    29004: 03 41 72 74 69 73 74 20 43 72 65 64 69 74 00 61 [.Artist Credit.a]
  |    29014: 6c 74 2d 4a 00                                  [lt-J.]
  | UserDefinedText = (CATALOGNUMBER) LIB140CD
  | - Tag 'TXXX' (24 bytes):
  |    29023: 03 43 41 54 41 4c 4f 47 4e 55 4d 42 45 52 00 4c [.CATALOGNUMBER.L]
  |    29033: 49 42 31 34 30 43 44 00                         [IB140CD.]
  | UserDefinedText = (MusicBrainz Album Artist Id) fc7bbf00-fbaa-4736-986b-b3ac0266ca9b
  | - Tag 'TXXX' (66 bytes):
  |    29045: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    29055: 75 6d 20 41 72 74 69 73 74 20 49 64 00 66 63 37 [um Artist Id.fc7]
  |    29065: 62 62 66 30 30 2d 66 62 61 61 2d 34 37 33 36 2d [bbf00-fbaa-4736-]
  |    29075: 39 38 36 62 2d 62 33 61 63 30 32 36 36 63 61 39 [986b-b3ac0266ca9]
  |    29085: 62 00                                           [b.]
  | UserDefinedText = (MusicBrainz Album Id) 53042259-1287-4f47-9a99-5a7413df7b3f
  | - Tag 'TXXX' (59 bytes):
  |    29091: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    290a1: 75 6d 20 49 64 00 35 33 30 34 32 32 35 39 2d 31 [um Id.53042259-1]
  |    290b1: 32 38 37 2d 34 66 34 37 2d 39 61 39 39 2d 35 61 [287-4f47-9a99-5a]
  |    290c1: 37 34 31 33 64 66 37 62 33 66 00                [7413df7b3f.]
  | UserDefinedText = (MusicBrainz Album Release Country) AU
  | - Tag 'TXXX' (38 bytes):
  |    290d6: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    290e6: 75 6d 20 52 65 6c 65 61 73 65 20 43 6f 75 6e 74 [um Release Count]
  |    290f6: 72 79 00 41 55 00                               [ry.AU.]
  | UserDefinedText = (MusicBrainz Album Status) Official
  | - Tag 'TXXX' (35 bytes):
  |    29106: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    29116: 75 6d 20 53 74 61 74 75 73 00 4f 66 66 69 63 69 [um Status.Offici]
  |    29126: 61 6c 00                                        [al.]
  | UserDefinedText = (MusicBrainz Album Type) album
  | - Tag 'TXXX' (30 bytes):
  |    29133: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 6c 62 [.MusicBrainz Alb]
  |    29143: 75 6d 20 54 79 70 65 00 61 6c 62 75 6d 00       [um Type.album.]
  | UserDefinedText = (MusicBrainz Artist Id) fc7bbf00-fbaa-4736-986b-b3ac0266ca9b
  | - Tag 'TXXX' (60 bytes):
  |    2915b: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 41 72 74 [.MusicBrainz Art]
  |    2916b: 69 73 74 20 49 64 00 66 63 37 62 62 66 30 30 2d [ist Id.fc7bbf00-]
  |    2917b: 66 62 61 61 2d 34 37 33 36 2d 39 38 36 62 2d 62 [fbaa-4736-986b-b]
  |    2918b: 33 61 63 30 32 36 36 63 61 39 62 00             [3ac0266ca9b.]
  | UserDefinedText = (MusicBrainz Release Group Id) 0d8562eb-7f72-427b-8a0b-984cc5ee7766
  | - Tag 'TXXX' (67 bytes):
  |    291a1: 03 4d 75 73 69 63 42 72 61 69 6e 7a 20 52 65 6c [.MusicBrainz Rel]
  |    291b1: 65 61 73 65 20 47 72 6f 75 70 20 49 64 00 30 64 [ease Group Id.0d]
  |    291c1: 38 35 36 32 65 62 2d 37 66 37 32 2d 34 32 37 62 [8562eb-7f72-427b]
  |    291d1: 2d 38 61 30 62 2d 39 38 34 63 63 35 65 65 37 37 [-8a0b-984cc5ee77]
  |    291e1: 36 36 00                                        [66.]
  | UserDefinedText = (Script) Latn
  | - Tag 'TXXX' (13 bytes):
  |    291ee: 03 53 63 72 69 70 74 00 4c 61 74 6e 00          [.Script.Latn.]
  | ID3_UFID = http://musicbrainz.org875bad60-ef81-42aa-b719-b97455092e45
  | - Tag 'UFID' (59 bytes):
  |    29205: 68 74 74 70 3a 2f 2f 6d 75 73 69 63 62 72 61 69 [http://musicbrai]
  |    29215: 6e 7a 2e 6f 72 67 00 38 37 35 62 61 64 36 30 2d [nz.org.875bad60-]
  |    29225: 65 66 38 31 2d 34 32 61 61 2d 62 37 31 39 2d 62 [ef81-42aa-b719-b]
  |    29235: 39 37 34 35 35 30 39 32 65 34 35                [97455092e45]
  | Lyrics = 
  | - Tag 'USLT' (6 bytes):
  |    2924a: 03 00 00 00 00 00                               [......]
ID3v1:
  + [BinaryData directory, 128 bytes]
  | Title = ? (Ripe & Ruin)
  | - Tag 0x0003 (30 bytes, string[30]):
  |   22a38b: 3f 20 28 52 69 70 65 20 26 20 52 75 69 6e 29 00 [? (Ripe & Ruin).]
  |   22a39b: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Artist = alt-J
  | - Tag 0x0021 (30 bytes, string[30]):
  |   22a3a9: 61 6c 74 2d 4a 00 00 00 00 00 00 00 00 00 00 00 [alt-J...........]
  |   22a3b9: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Album = An Awesome Wave
  | - Tag 0x003f (30 bytes, string[30]):
  |   22a3c7: 41 6e 20 41 77 65 73 6f 6d 65 20 57 61 76 65 00 [An Awesome Wave.]
  |   22a3d7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00       [..............]
  | Year = 2012
  | - Tag 0x005d (4 bytes, string[4]):
  |   22a3e5: 32 30 31 32                                     [2012]
  | Comment = 
  | - Tag 0x0061 (30 bytes, string[30]):
  |   22a3e9: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  |   22a3f9: 00 00 00 00 00 00 00 00 00 00 00 00 00 02       [..............]
  | Track = 0 2
  | - Tag 0x007d (2 bytes, int8u[2]):
  |   22a405: 00 02                                           [..]
  | Genre = 52
  | - Tag 0x007f (1 bytes, int8u[1]):
  |   22a407: 34                                              [4]

And here is a copy of the file (audio replaced with Kevin MacLeod - Pixelland again).

EDIT: After doing a preliminary diff of the exiftool outputs, I can't see an obvious difference between the tags.

sampsyo commented 8 years ago

Truly mysterious.

So, we can hypothesize that beets is doing something wrong to files that is causing Android to read their tags using the wrong encoding. Following that lead, has the Alt-J track (which seems to work fine) been "touched" by beets? If so, we might want to reject that hypothesis.

A second hypothesis would be that some characters, but not all, are causing problems. Are there any tracks with degrees symbols or curly apostrophes that do work correctly?

jackwilsdon commented 8 years ago

Sadly I don't have a copy of the alt-J track that has not been "touched" by beets, as my entire music library is managed by it.

I don't have any tracks in my library that have degrees symbols or curly apostrophes that do work, so I can't prove/disprove your second hypothesis.

I find it odd that the degree symbol itself renders but is prefixed with a "B", and I can't work out why. The UTF-8 encoding of the degree symbol is C2 B0, and neither byte represents the letter "B".

sampsyo commented 8 years ago

Yeah, that thing with the capital B is truly strange. Maybe this is some other non-ASCII, non-Unicode encoding we're seeing?

Kernald commented 8 years ago

I noticed a while ago the same issue with the <|°_°|> album from Caravan Palace, copied to my phone via beets alternatives. I'll try to rip the album from the CD again and copy it directly, without using beets, just to check.

sampsyo commented 8 years ago

Wow; we have independent confirmation from another source for the same album! Thanks, @Kernald; keep us posted about what you find.

Kernald commented 8 years ago

So, I have the FLAC files, directly extracted from the CD via SoundJuicer. But as Google Play Music doesn't recognize those, I converted them to MP3 with ffmpeg:

find . -name "*.flac" -exec ffmpeg -i {} -ab 160k -map_metadata 0 -id3v2_version 3 {}.mp3 \;

Here, the album title was displayed correctly:

Input #0, flac, from './Disc 1 - 04 - Aftermath.flac':
  Metadata:
    TITLE           : Aftermath
    ARTIST          : Caravan Palace
    track           : 4
    TRACKTOTAL      : 11
    ALBUM           : <|°_°|>

[…]

Output #0, mp3, to './Disc 1 - 04 - Aftermath.flac.mp3':
  Metadata:
    TIT2            : Aftermath
    TPE1            : Caravan Palace
    TRCK            : 4
    TRACKTOTAL      : 11
    TALB            : <|°_°|>

And… Here's the result. I can't see that as an improvement ;-)

screenshot_20160229-191927

If you have any suggestion, I'll be happy to try!

sampsyo commented 8 years ago

Wow. Truly fascinating. I guess we can conclude ffmpeg has the same problem getting the encoding right for what Android expects?

I tried googling for similar problems. This is a really old bug, and probably not relevant: https://code.google.com/p/android/issues/detail?id=2688

I tested the hypothesis anyway:

>>> print(u'<|°_°|>'.encode('utf8').decode('latin1'))
<|Â°_Â°|>

So that doesn't explain either wrong result. :cry:

I can't seem to find any encoding/decoding mismatch that produces exactly these results… maybe it's time to write a script to try them all and see what happens??
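
A rough sketch of such a "try them all" script follows. It assumes the tags really are UTF-8 on disk (as the exiftool dumps suggest) and simply decodes those bytes with every codec Python knows about, grouping the codecs by the garbled text they produce. The helper name `candidate_decodings` is made up for this example. For the degree sign, one result worth a look is cp1251, where byte C2 maps to the Cyrillic letter В, which is easy to mistake for a Latin B. Whether that is what Android actually picked is not confirmed anywhere in this thread.

```python
from encodings.aliases import aliases


def candidate_decodings(text, source_encoding="utf-8"):
    """Map each distinct mis-decoding of `text` to the codecs that produce it."""
    raw = text.encode(source_encoding)
    results = {}
    for codec in sorted(set(aliases.values())):
        try:
            decoded = raw.decode(codec)
        except (ValueError, LookupError):
            # Bytes are invalid in this codec, the codec is not a text
            # encoding, or it is unavailable on this platform.
            continue
        if decoded != text:
            results.setdefault(decoded, []).append(codec)
    return results


# Degree sign and right single quotation mark, the two problem characters.
for sample in ("\u00b0", "\u2019"):
    for decoded, codecs in candidate_decodings(sample).items():
        print(repr(sample), "->", repr(decoded), codecs[:5])
```

Comparing the printed candidates against the screenshots by eye should narrow down which encoding the player guessed.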

sampsyo commented 8 years ago

Well, I gave ftfy a try. Here's what it found:

>>> ftfy.fixes.fix_encoding_and_explain('Writingâ€™s')
('Writing’s', [('encode', 'sloppy-windows-1252', 0), ('decode', 'utf-8', 0)])

And sure enough:

>>> 'Writing’s'.encode('utf8').decode('windows-1252')
'Writingâ€™s'

which indicates that beets is writing UTF-8 and Android is trying to interpret it as a weird Windows codepage. :confused:

Still no leads on what's up with the B° mojibake though… and I don't know how to type those Chinese (?) characters to give those a try.
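
One possible (and unconfirmed) reason the alt-J title with the floral heart survives: its UTF-8 bytes are e2 9d a6, and 0x9D is not assigned in Windows-1252, so a guesser cannot pick that codepage without hitting an invalid byte. The sketch below only demonstrates that property of the codec. It says nothing about how Android's detector really works, and `decode_utf8_bytes_as` is a hypothetical helper name.

```python
SAMPLES = {
    "right single quote": "\u2019",  # e2 80 99
    "degree sign": "\u00b0",         # c2 b0
    "floral heart": "\u2766",        # e2 9d a6
}


def decode_utf8_bytes_as(text, codec):
    """Encode text as UTF-8, then try to read those bytes with another codec."""
    raw = text.encode("utf-8")
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None  # at least one byte is invalid in that codec


for name, char in SAMPLES.items():
    raw_hex = char.encode("utf-8").hex(" ")
    print(name, raw_hex, repr(decode_utf8_bytes_as(char, "cp1252")))
```

The quote and the degree sign both decode "successfully" into mojibake, while the floral heart cannot be read as Windows-1252 at all.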

sampsyo commented 8 years ago

One other sad fact: ID3 doesn't even seem to specify Windows 1252 as one of the possible encodings: https://en.wikipedia.org/wiki/ID3#ID3v2

So it's a mystery as to why Android is using it.

sampsyo commented 8 years ago

OK, sorry for all the commenting, but I apparently can't let this go!

First: I looked at @jackwilsdon's files, and they correctly report the encoding as UTF-8. So that's a dead end.

Some googling revealed that this is probably a bug in Android: https://code.google.com/p/android/issues/detail?id=81428

People still seem to be complaining about it as recently as last month. Apparently, the Android frameworks—for all media formats—just ignore the specified encoding and guess, based on the data, which encoding it uses. Guessing encodings is notoriously difficult, so it frequently guesses wrong. Given that, I'm not sure I can see how to work around this. :cry:

jackwilsdon commented 8 years ago

It sounds like there isn't much we can do sadly, just wait for Android to fix it I guess!

I've been thinking of a fix Beets-side but it feels a bit hacky;

A tagreplace plugin could be written to replace certain characters with others, configured in the user's configuration file. This wouldn't work for tags like <°_°> but it would work for the quotes I think.

lazka commented 8 years ago

In quodlibet we use utf-8 for ascii text and utf-16 for everything else. If Android is really guessing, utf-16 shouldn't give it much choice.
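
A minimal sketch of that policy, for illustration only: it is not quodlibet's or mutagen's actual code, and `encode_text_frame_body` is a made-up name. It assumes the standard ID3v2 text-frame layout, in which the first byte of the frame body names the encoding (1 for UTF-16 with a byte-order mark, 3 for UTF-8 in ID3v2.4).

```python
UTF16_WITH_BOM = 1  # ID3v2 text-encoding byte for UTF-16 with a byte-order mark
UTF8 = 3            # ID3v2.4 text-encoding byte for UTF-8


def encode_text_frame_body(text):
    """Build a text-frame body: UTF-8 for pure ASCII, UTF-16 for anything else."""
    try:
        # Pure ASCII is byte-identical in UTF-8, so there is nothing to guess wrong.
        return bytes([UTF8]) + text.encode("ascii") + b"\x00"
    except UnicodeEncodeError:
        # Python's "utf-16" codec prepends the BOM; the terminator is two NUL bytes.
        return bytes([UTF16_WITH_BOM]) + text.encode("utf-16") + b"\x00\x00"


print(encode_text_frame_body("Tourist History"))
print(encode_text_frame_body("Eat That Up, It\u2019s Good for You")[:8])
```

The idea is that the BOM and the interleaved NUL bytes leave an encoding guesser far fewer plausible candidates than a mostly-ASCII UTF-8 string does.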

jurf commented 6 years ago

I have the same problem, except it only happens for some files. E.g. Led Zeppelin’s D’yer Mak’er appears as Dâ€™yer Makâ€™er, but The Beatles’ I’ve Got a Feeling appears correctly. Is there any way I can force the same encoding or whatever it is that is used there everywhere, in Beets?

nerone-github commented 6 years ago

Hi, I've stumbled across this issue as well. Yes, we know it's Android's fault, but there is an easy way to fool it so that the tags are displayed correctly.

If you have a FLAC file whose tags display incorrectly, just add a Russian (Cyrillic) character to a tag instead of using only ANSI (Latin) letters. The tag will then be recognized as UTF-8.

Suggestion: don't replace the first letter, so that alphabetical ordering stays correct.

You can replace the following letters, which look the same in Cyrillic and Latin but trigger UTF-8 recognition:

КОМЕТА ВНРСХ оеа рсх (these are the Cyrillic variants, so you can copy and paste from here)

The replaced letter can be anywhere within the tags (artist, album, title, etc.) and the file will be recognized as UTF-8.
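
The trick above could be automated roughly as follows. This is only a sketch of the workaround as described, with a hypothetical helper name (`swap_one_lookalike`). It swaps a single letter and skips position 0 to preserve sort order.

```python
# Latin letters and their Cyrillic lookalikes, taken from the list above.
LATIN = "KOMETABHPCXoeapcx"
CYRILLIC = "КОМЕТАВНРСХоеарсх"
LOOKALIKES = dict(zip(LATIN, CYRILLIC))


def swap_one_lookalike(text):
    """Replace the first swappable letter after position 0 with its Cyrillic twin."""
    for index, char in enumerate(text):
        if index > 0 and char in LOOKALIKES:
            return text[:index] + LOOKALIKES[char] + text[index + 1:]
    return text  # nothing swappable; the caller has to handle this case


print(swap_one_lookalike("D\u2019yer Mak\u2019er"))
```

Keep in mind that the stored title is no longer the same string afterwards, so exact-match searches for the original spelling will miss it.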

indivisible commented 5 years ago

Might be a slightly different Android bug, but I've had similar issues with the Android media scanner. (Note: since the default scanner is broken, different vendors might ship slightly different "improved" versions with different bugs.)

I got things working by using only ID3 v2.4 tags, UTF-8 encoding and no extended headers.

The extended header thing makes things really confusing: if the tag has one, Android will fail to read any info from the tag and fall back to the v1 tag, most likely failing to decode any fancy characters.

I've created a script that I use to make my files "android-safe" (be sure to keep backups!)
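
For anyone who wants to check a file by hand, here is a minimal sketch (not the script mentioned above) that reads the 10-byte ID3v2 header and reports the tag version and whether the extended-header flag is set. It assumes the standard header layout: "ID3", two version bytes, one flags byte with bit 6 marking an extended header, and a 4-byte synchsafe size. The function names are made up, and the check does not inspect per-frame text encodings.

```python
def parse_id3v2_header(header):
    """Parse the 10-byte ID3v2 header; return None if it is not one."""
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    major, revision, flags = header[3], header[4], header[5]
    # The tag size is a 28-bit "synchsafe" integer: 7 useful bits per byte.
    size = 0
    for byte in header[6:10]:
        size = (size << 7) | (byte & 0x7F)
    return {
        "version": (2, major, revision),
        "extended_header": bool(flags & 0x40),  # bit 6 of the flags byte
        "size": size,
    }


def header_looks_android_safe(path):
    """True if the file starts with an ID3v2.4 tag that has no extended header."""
    with open(path, "rb") as handle:
        info = parse_id3v2_header(handle.read(10))
    return (
        info is not None
        and info["version"][1] == 4
        and not info["extended_header"]
    )
```

Calling `header_looks_android_safe("song.mp3")` on a copy of a problem file is a quick way to rule the extended-header theory in or out.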

lazka commented 5 years ago

mutagen doesn't write extended headers

jurf commented 5 years ago

@lazka: but I guess they could already be present.

lazka commented 5 years ago

> @lazka: but I guess they could already be present.

mutagen replaces the header always

jurf commented 5 years ago

Why is this happening then?

lazka commented 5 years ago

@indivisible ^?

tnyeanderson commented 5 years ago

See this issue. Vanilla seems to be the only open source android music player that has gotten around this bug. To do it, they had to build their own database instead of using the mediastore. There is nothing beets can do to fix this, EXCEPT if someone makes a plugin to find and replace characters in ID3 tags. Something like this in config.yaml:

plugins: replacer

replacer:
  fields: all
  replace:
    - ’:'
    - °:*

Should it have the option to only replace certain fields (song title, etc.)? Or maybe not. This could impact performance when using the autotagger, as it would have to check every field. And even then it's not the best solution. The best solution would be for Android to stop playing guessing games and use the encoding that is set by the app... good luck with that :)
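
The core of such a plugin would be a small replacement step like the sketch below. No `replacer` plugin exists in this thread, and this code deliberately avoids the beets plugin API: `REPLACEMENTS` and `replace_in_tags` are hypothetical names, and the tags are modelled as a plain dict.

```python
# Hypothetical replacement table mirroring the config example above.
REPLACEMENTS = {"\u2019": "'", "\u00b0": "*"}


def replace_in_tags(tags, replacements=REPLACEMENTS, fields=None):
    """Return a copy of `tags` with problem characters replaced.

    `fields` limits the replacement to the named keys; None means all fields.
    Non-string values (track numbers and so on) are passed through untouched.
    """
    table = str.maketrans(replacements)
    cleaned = {}
    for key, value in tags.items():
        if isinstance(value, str) and (fields is None or key in fields):
            cleaned[key] = value.translate(table)
        else:
            cleaned[key] = value
    return cleaned


print(replace_in_tags({"title": "It\u2019s Good for You", "album": "<|\u00b0_\u00b0|>"}))
```

Building the translation table once and using `str.translate` keeps the per-field cost small, which should ease the performance concern, although that has not been measured here.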

lazka commented 5 years ago

> There is nothing beets can do to fix this

https://github.com/beetbox/beets/issues/1893#issuecomment-199165354 should work

tnyeanderson commented 5 years ago

Reading the Google issue, it doesn't look like UTF-16 will be a panacea. And according to this, support for UTF-16 may not be ubiquitous enough to justify such a change.

I am by no means an expert on encoding. Can someone please tell me I'm wrong? :)

tnyeanderson commented 5 years ago

https://stackoverflow.com/a/48270759/5057843

This contradicts info I have read elsewhere (saying UTF-16 DOES fix the issue). Not sure who to believe...

lazka commented 5 years ago

I shouldn't have said that it will fix it, but utf-16 will give the encoding guessing code less chance to select the wrong encoding, at least for otherwise ASCII heavy text:

>>> import chardet
>>> text = "Two Door Cinema Club - Tourist History - Eat That Up, It’s Good for You"
>>> chardet.detect(text.encode("utf-8"))
{'encoding': 'Windows-1252', 'confidence': 0.73, 'language': ''}
>>> chardet.detect(text.encode("utf-16"))
{'encoding': 'UTF-16', 'confidence': 1.0, 'language': ''}

tnyeanderson commented 5 years ago

@lazka That's actually very promising. Has it been tested on Android to verify that their guessing algorithm reacts appropriately? I'd settle for a beets config option to convert my library's metadata UTF-16 if it works!!

Karcsii commented 5 years ago

> @lazka That's actually very promising. Has it been tested on Android to verify that their guessing algorithm reacts appropriately? I'd settle for a beets config option to convert my library's metadata UTF-16 if it works!!

It doesn't. I have my whole library's ID3 tags encoded with UTF-16, and some songs are still displayed with random Chinese characters. The most interesting thing is that some songs from an album are displayed with messed-up artist and album data, while other songs from the same album and artist are displayed correctly. [two screenshots attached]

stale[bot] commented 4 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Karcsii commented 4 years ago

> Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
>
> This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Issue still relevant and unresolved. Waiting for somebody who knows what causes the problem and can suggest a way to avoid it or until Google fixes it.

sampsyo commented 4 years ago

Hello! For @jtpavlock, I'm not actually sure this is an issue we should keep open as tagged "feature," even though it's still clearly affecting people—because it's not yet clear whether there is anything we (on the beets side) can do about it. One criterion I like to use when transitioning from "needinfo" to a more specific tag is that we have enough information that the issue is now actionable: that is, someone with the time and energy can plausibly do something about it. For now, I think this issue still needs more information before anyone can actually fix it.

jtpavlock commented 4 years ago

@sampsyo makes sense, sorry about that. Since this one seems like an oddball in that it may be in extended limbo, I was just trying to think what should be done, if anything, to prevent the repeated stale-bot messages.

lazka commented 4 years ago

I did a quick test with <|°_°|> and utf8/utf16/utf16be. Android guesses in all cases and always guesses wrong, so ignore my earlier suggestion to use utf16.

sampsyo commented 4 years ago

No worries, @jtpavlock, and thanks for taking a look, @lazka!

stale[bot] commented 4 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

tnyeanderson commented 4 years ago

An AOSP bug that causes this was first reported in 2009. In 2014, the root cause and a fix for it were suggested.

8 days ago (09.04.2020) this issue was closed as wontfix.

Apparently some manufacturers have implemented a fix in their own distros, but this is manufacturer/device dependent.

I can't tell if the issue is still relevant on my end, but it might be for some users. Not sure how to move forward...

jackwilsdon commented 4 years ago

I'll see if I can still reproduce this issue if I get a second - it seems like it was just closed as it's an old issue, we're free to raise a new one if it's still a problem.

stale[bot] commented 4 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

jackwilsdon commented 3 years ago

Looks like this issue is still present with Android's MediaStore API:

[screenshot of corrupt text]

Using simple music player, which seems to just pull the text straight from the API.

tnyeanderson commented 3 years ago

Confirming this is still relevant. Not sure on the progress of the old android bugs, but the results are the same in the app--wrong character displays.

tnyeanderson commented 3 years ago

Any update on this? If not, in the meantime is there any way to run a find/replace for all instances of a specific character or string for a given tag in the whole library? The alternative is using a separate tool to retag with the more compatible characters, then reimport all tracks to beets as-is (so beets has up to date info in its db).... then continue to check/retag/reimport when there's any problems with future imports.

I have a LOT of issues with a certain character (right single quotation mark). I can find all instances of the character using beet list $(printf "\xE2\x80\x99") but it seems like writing a beet modify script to replace them with a regular apostrophe is a dangerous game due to how beets outputs information to the terminal.

If someone has a solution or workaround, please let me know!
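In case it helps with deciding what to replace, here is a small standard-library sketch that lists which non-ASCII characters occur in a set of tag values and how often. The function name is hypothetical; the input is just a list of strings, however you export them from your library.

```python
import unicodedata
from collections import Counter


def non_ascii_report(values):
    """Return (code point, Unicode name, count) for each non-ASCII character, most frequent first."""
    counts = Counter(ch for value in values for ch in value if ord(ch) > 127)
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), n)
        for ch, n in counts.most_common()
    ]


# The bytes used in the printf query above are the UTF-8 form of U+2019.
assert b"\xE2\x80\x99".decode("utf-8") == "\u2019"
```

Running it over all titles, albums, and artists gives a list of the characters worth mapping before touching any files.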

stale[bot] commented 3 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

tnyeanderson commented 3 years ago

Still relevant

kytta commented 2 years ago

Chiming in to say I have the exact same problem.

It wouldn't be as bad if everything were treated as Windows-1252, but here some songs are and some aren't, which creates multiple albums in my media library.

As an example, Oasis' album (What’s the Story) Morning Glory?

Such a shame that Google says it's an "obsolete wontfix" when clearly lots of people have this issue to this day. Yet I am not sure if we need to keep this issue open, as it's not beets' fault.

EDIT: the players I've tried were Auxio, Spotify, and YouTube Music (the last two obviously set to "local files" mode)
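To illustrate why the album gets split, here is a small sketch of what happens when UTF-8 tag bytes are read as Windows-1252. It only reproduces the symptom; it makes no claim about how Android's scanner actually picks an encoding, and the function name is made up.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as Windows-1252, as a wrong guess would."""
    # errors="replace" covers the few byte values that Windows-1252 leaves undefined.
    return text.encode("utf-8").decode("cp1252", errors="replace")


album = "(What\u2019s the Story) Morning Glory?"
garbled = misread_as_cp1252(album)
print(garbled)  # (Whatâ€™s the Story) Morning Glory?

# Two different strings means two different albums in a media library.
print(len({album, garbled}))  # 2
```

If only some tracks are misread, the library ends up with both spellings, which matches the duplicate albums described above.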

makawity commented 1 year ago

Although this is not a problem in Beets, there is a plugin that seems to be able to fix it in Beets library if you want to go that route: https://github.com/edgars-supe/beets-importreplace