Closed tillt closed 4 years ago
Possibly fixed in https://github.com/bogem/id3v2/commit/88f2646e81c1ad6fc4de6006754314e5faaf5634. Can you please test it? 🙏
Thanks a bunch @bogem - I finally got around testing it against some of my previously failed MP3s and all seems fine now - tested v1.1.3.
It appears text data, when UTF16 LE encoded, won't get properly parsed for some frame types.
Consider the following user defined text frame:
00000000: 54 58 58 58 00 00 00 25 00 00 01 ff fe 44 00 49 TXXX...%.....D.I 00000010: 00 53 00 43 00 49 00 44 00 00 00 ff fe 36 00 33 .S.C.I.D.....6.3 00000020: 00 30 00 41 00 43 00 30 00 30 00 38 00 00 00 .0.A.C.0.0.8...
Encoding key is
0x01
which is UTF16 with BOM. The description of thatTXXX
is "DISCID" and its trailing 'D' is encoded by[0x44, 0x00]
.The problem I noticed with UTF16 LE encoded, user defined text frame contents that for the description, the trailing character may get garbled, the following value then starts at the wrong offset, adding garbage at the beginning.
readTillDelims
will seek through that frame's byte stream and will try to detect the expected two trailing zero bytes without adding them to the respective frame-data - in this case the description. The intend is to stop at the delimiter[0x00, 0x00]
, we do however incorrectly consider that last byte of 'D' rune data which happens to be0x00
as being the first byte of the delimiter. We end up with truncated UTF16 data.The user defined text frame parser will then try to read the value starting one byte too early, adding a prefixing
0x00
.The following output is what I get based on id3v2's functionality;
This nasty workaround -- not a fix!
Then produces the correct output:
A proper fix needs to be located in
readTillDelims
, I guess.