aadsm / jsmediatags

Media Tags Reader (ID3, MP4, FLAC)

ID3v2.3 WXXX (URL) tags parsed incorrectly as UCS-2 #126

Open Quppa opened 4 years ago

Quppa commented 4 years ago

In version 3.9.3, the URL in ID3v2.3 WXXX (user-defined URL) frames is being parsed incorrectly as UCS-2 rather than ISO-8859-1. From the spec:

All numeric strings and URLs are always encoded as ISO-8859-1.

Example bytes of a WXXX tag where user_description should be 'あ' (UCS-2 encoded) and data should be https://www.example.com: 57 58 58 58 00 00 00 1E 00 00 01 FF FE 42 30 00 00 68 74 74 70 73 3A 2F 2F 77 77 77 2E 65 78 61 6D 70 6C 65 2E 63 6F 6D
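To make the expected result concrete, here is a minimal sketch of how those bytes could be decoded when the two fields are handled separately. `parseWxxxFrame` is a hypothetical helper written for this comment, not part of jsmediatags, and it assumes an ID3v2.3 frame whose UCS-2 description starts with a BOM:

```javascript
// Hypothetical helper for illustration; this is not the jsmediatags implementation.
function parseWxxxFrame(bytes) {
  const id = String.fromCharCode(...bytes.slice(0, 4));
  if (id !== 'WXXX') throw new Error('Not a WXXX frame: ' + id);

  // ID3v2.3 frame sizes are plain 32-bit big-endian integers.
  const size = ((bytes[4] << 24) | (bytes[5] << 16) | (bytes[6] << 8) | bytes[7]) >>> 0;
  // Skip the 10-byte frame header: 4 id bytes + 4 size bytes + 2 flag bytes.
  const body = bytes.slice(10, 10 + size);

  const encoding = body[0]; // 0x00 = ISO-8859-1, 0x01 = UCS-2 with BOM
  let pos = 1;
  let description = '';

  if (encoding === 0x01) {
    // Assumption: a BOM is present; anything other than FF FE is read as big-endian.
    const littleEndian = body[pos] === 0xff && body[pos + 1] === 0xfe;
    pos += 2;
    // Read 16-bit code units until the two-byte 00 00 terminator.
    while (pos + 1 < body.length && (body[pos] | body[pos + 1]) !== 0) {
      const unit = littleEndian
        ? body[pos] | (body[pos + 1] << 8)
        : (body[pos] << 8) | body[pos + 1];
      description += String.fromCharCode(unit);
      pos += 2;
    }
    pos += 2; // skip the terminator
  } else {
    // ISO-8859-1 description, terminated by a single 00 byte.
    while (pos < body.length && body[pos] !== 0) {
      description += String.fromCharCode(body[pos]);
      pos += 1;
    }
    pos += 1; // skip the terminator
  }

  // The URL is always ISO-8859-1, regardless of the encoding byte.
  let url = '';
  while (pos < body.length && body[pos] !== 0) {
    url += String.fromCharCode(body[pos]);
    pos += 1;
  }
  return { description, url };
}

// The example bytes from this issue.
const exampleBytes = (
  '57 58 58 58 00 00 00 1E 00 00 01 FF FE 42 30 00 00 ' +
  '68 74 74 70 73 3A 2F 2F 77 77 77 2E 65 78 61 6D 70 6C 65 2E 63 6F 6D'
).split(' ').map((h) => parseInt(h, 16));

const frame = parseWxxxFrame(exampleBytes);
console.log(frame.description, frame.url);
```

With the bytes above this should yield the description 'あ' and the URL https://www.example.com, because only the description is read with the charset named by the encoding byte.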

I think the issue was introduced with this change: https://github.com/aadsm/jsmediatags/pull/120

Repro: https://codepen.io/Quppa/pen/yLNbPpX

Example MP3 (contains just TIT2 and WXXX): jsmediatags.zip

staltz commented 4 years ago

I can confirm this issue is happening. With the following MP3 (excerpt in hexadecimal):

...
00000280: 006f 006f 0000 0057 5858 5800 0000 5500  .o.o...WXXX...U.
00000290: 0001 fffe 4900 2000 6100 6d00 2000 6100  ....I. .a.m. .a.
000002a0: 2000 6c00 6900 6e00 6b00 2000 7700 6900   .l.i.n.k. .w.i.
000002b0: 7400 6800 2000 6100 2000 6400 6500 7300  t.h. .a. .d.e.s.
000002c0: 6300 7200 6900 7000 7400 6900 6f00 6e00  c.r.i.p.t.i.o.n.
000002d0: 0000 6874 7470 733a 2f2f 6578 616d 706c  ..https://exampl
000002e0: 652e 636f 6d00 4150 4943 0000 5fea 0000  e.com.APIC.._...
...

We want the description to be encoded as UTF-16, while the link has to be encoded as ISO-8859-1 (as the spec says). In jsmediatags 3.9.3, the function getUserDefinedFields assumes that userDesc and userDefinedData use the same charset, so the link ends up being decoded with the description's charset.
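As a rough illustration of why the shared-charset assumption breaks the link, the sketch below decodes the URL bytes from the dump (including the trailing 00) both ways using Node's `Buffer`. It is not the library's code and does not call getUserDefinedFields; it only shows what happens when ISO-8859-1 bytes are read as UTF-16LE:

```javascript
// Illustration only, assuming Node.js; not taken from jsmediatags.
// URL bytes as stored in the frame: ISO-8859-1 text followed by a 00 terminator.
const urlBytes = Buffer.from('https://example.com\0', 'latin1');

// Correct: one byte per character.
const asLatin1 = urlBytes.toString('latin1').replace(/\0$/, '');

// Incorrect: pairs of bytes are fused into single 16-bit code units.
const asUtf16 = urlBytes.toString('utf16le');

console.log(asLatin1);       // the readable URL
console.log(asUtf16.length); // half as many characters, none of them the original text
```

Each pair of ASCII bytes collapses into one unrelated code unit (for example, 68 74 becomes U+7468), which matches the kind of garbled output reported here.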