andygrundman / Audio-Scan

Audio::Scan - Fast Perl XS metadata and tag reader for all common audio file formats
GNU General Public License v2.0
3 stars 13 forks source link

ID3: Fix v2.4 extended header handling - Failure to extract tags #12

Open mw9 opened 2 years ago

mw9 commented 2 years ago

At present, Audio::Scan fails to extract tags from a file tagged with ID3 v2.4 when the tag includes an ID3 extended header. An extended header will be encountered when the tag carries a CRC.

Audio::Scan has no interest in the contents of an extended header, but it needs to be able to skip over it to identify and extract the ID3 frames that follow. The skip logic succeeds with a v2.3 tag, but fails with a v2.4 tag, because of changes in the way extended headers are handled between v2.3 and v2.4.

This proposed change corrects matters by distinguishing the two variants, and handling them accordingly. The logic is implemented in _id3_parse_v2, in src/id3.c

Three further test cases have also been added to handle v2.4 extended headers, supplementing the existing extended header tests which are relevant to v2.3.

Details may be found in section 3.2 Extended header of each of the published ID3 documents: ID3 v2.3: https://id3.org/d3v2.3.0 ID3 v2.4: https://id3.org/id3v2.4.0-structure

The issue was brought to light in a recent Squeezebox forum post: Should LMS read metadata from ID3v2.4? https://forums.slimdevices.com/showthread.php?115010-Should-LMS-read-metadata-from-ID3v2-4

The thread originator is using the mp3fs filesystem to generate/transcode mp3 files on the fly from flac source files. It happens that mp3fs tags the generated mp3 files with ID3 v2.4 tags and includes an ID3 CRC. In consequence, Audio::Scan failed to extract the tags.

I suspect that ID3 CRC's are not otherwise commonly found in the wild.

I have tested the proposed change over a reasonably sized collection of mp3 files tagged & CRC'd under v2.4, and it works as expected. But the collection is not particularly diverse, so surprises may yet emerge. The ID3 documentation does seem to be clear on how the logic should work, and that gives me confidence in the proposed change.

Mentioning @mherger and @ralph-irving as they expressed interest in the forum post.