wapmorgan / Mp3Info

The fastest PHP library to extract mp3 meta information (duration, bitrate, samplerate and so on) and tags (id3v1, id3v2).
https://wapmorgan.github.io/Mp3Info/
GNU Lesser General Public License v3.0
139 stars 41 forks source link

Doesn't find frame header b/c it's only checking every 2nd Byte #34

Open mbirth opened 6 months ago

mbirth commented 6 months ago

This file: http://archives.bassdrivearchive.com/6%20-%20Saturday/Electronic%20Warfare%20-%20The%20Overfiend/%5b2023.04.08%5d%20Electronic%20Warfare%20-%20Overfiend.mp3

The file seems to be bluntly cut out of a running stream. So the file doesn't begin with a tag or mpeg frame header. To test this locally, you must disable your isValidAudio() method. (Which should probably not insist on the file starting with a header anyways...)

At 0x32, there are the bytes FF 25 which seem to be part of the payload. The first valid mpeg frame starts at 0x46 with FF FB.

Now, your readMpegFrame() method always reads only every second Byte to look for the FF. It does $first_header_byte = $this->readBytes($fp, 1) which advances the pointer by 1 Byte, and then does a fseek($fp, 1, SEEK_CUR) which advances the pointer another Byte before the loop starts again to check the next Byte for 0xFF. So you check Bytes 0x00, 0x02, 0x04, 0x06, etc.

However, in the above file, this check trips over the non-header 0xFF at 0x32, then does $second_header_byte = $this->readBytes($fp, 1) which advances the pointer by 1 (pointer is now at 0x34). And then continues with the fseek() in L366 to set the pointer to 0x35 before the loop starts again. This causes the scan to now only look at all odd Bytes, e.g. the next Bytes it will check for FF will be 0x35, 0x37, 0x39, 0x3B, etc.

This makes it miss various frame headers that have their FF on an even location. Especially the one at 0x46. With the above file, it will run into the $headerSeekLimit and throw an Exception because it doesn't find any frame until then.

If you increase the $headerSeekLimit it will eventually find some frame, but this won't be the first one.

To mitigate this, I've added an else fseek($fp, -1, SEEK_CUR); after the if ((($second_header_byte[0] >> 5)... block in line 364. But you might have a better solution.

mbirth commented 6 months ago

Maybe there's another fseek($fp ,-1, SEEK_CUR); needed after the outer $first_header_byte block as well.

With the file: http://archives.bassdrivearchive.com/6%20-%20Saturday/Electronic%20Warfare%20-%20The%20Overfiend/%5b2023.07.08%5d%20Electronic%20Warfare%20-%20Overfiend.mp3 it totally misses the first frame header at 0x43 because it's on an odd position.

Instead, it finds the FF ED E3 14 at 0x20C which it parses as CODEC_UNDEFINED and subsequently throws various Warning: Trying to access array offset on value of type null until it finally runs into a DivisionByZeroError.

Maybe, there should be a better pre-check for a valid mpeg frame header before trying to parse it as one. So that it continues to look for a valid one, instead of parsing garbage and crashing.

EDIT: A return null; instead of setting it to CODEC_UNDEFINED works for me. It will just look for the next frame (until it reaches $framesCountRead).