WeTransfer / format_parser

file metadata parsing, done cheap
https://rubygems.org/gems/format_parser

Disable `udta` Parsing #233

Closed Kevin-McGonigle closed 1 year ago

Kevin-McGonigle commented 1 year ago

The User Data (udta) ISOBMFF box is currently treated as a standard container box that is expected to contain other ISOBMFF boxes. However, its contents don't always conform to the standard, resulting in incorrect parsing, corrective buffer position updates and warning logs. This PR disables udta box parsing, so the parser skips the box's contents.

Kevin-McGonigle commented 1 year ago

lgtm

I have no idea what exactly that change does but it matches your description :)

If you're interested, head to peek-worker logs and you'll see a bunch that look like:

Unexpected IO position after parsing udta box at position X. Box size: Y. Expected position: X + Y. Actual position: Z.

Basically, we try to treat the contents of the udta box as if they conformed to the standard, but they often don't, so our parser freaks out and ends up in a weird state. We have a fail-safe built in that resets the buffer to the correct position and generates that log warning when something like this happens, but it's happening pretty often for udta boxes and polluting the logs a bit. So this PR stops treating the udta box as a container (which is what prompted the parser to attempt to parse its contents) and instead treats it as an unknown box, which just notes the existence of the box without diving inside it.
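For anyone who wants to see the difference concretely, here is a minimal, self-contained sketch of the idea. It is not the gem's actual code: `parse_boxes`, `build_box`, `Box` and the container lists are hypothetical names made up for illustration, and the warning text is only modelled on the log line quoted above.

```ruby
require "stringio"

# Hypothetical stand-in for a parsed box; children is nil for boxes we skip.
Box = Struct.new(:type, :position, :size, :children)

# Test helper: 4-byte big-endian size (header included) + 4-char type + payload.
def build_box(type, payload = "")
  [payload.bytesize + 8].pack("N") + type + payload
end

# Reads boxes between the current position and end_pos.
# Types listed in `containers` are descended into; everything else is
# treated as an unknown box: recorded, then skipped.
def parse_boxes(io, end_pos, containers:, warnings:)
  boxes = []
  while io.pos + 8 <= end_pos
    start = io.pos
    size, type = io.read(8).unpack("Na4")
    break if size < 8 # not a valid box header, stop parsing this level

    expected = start + size
    children = nil
    if containers.include?(type)
      children = parse_boxes(io, [expected, end_pos].min,
                             containers: containers, warnings: warnings)
      # Fail-safe: child parsing should land exactly on the end of the box.
      if io.pos != expected
        warnings << "Unexpected IO position after parsing #{type} box at position #{start}. " \
                    "Box size: #{size}. Expected position: #{expected}. Actual position: #{io.pos}."
      end
    end
    io.seek(expected) # always continue from where the box claims to end
    boxes << Box.new(type, start, size, children)
  end
  boxes
end

# A moov box holding a udta box whose payload is not a valid box sequence.
data = build_box("moov", build_box("udta", "\x00\x00\x00\x05junkdata"))

# Before: udta treated as a container, so the fail-safe fires.
old_warnings = []
parse_boxes(StringIO.new(data), data.bytesize,
            containers: %w[moov udta], warnings: old_warnings)

# After: udta is an unknown box, noted but not descended into.
new_warnings = []
new_boxes = parse_boxes(StringIO.new(data), data.bytesize,
                        containers: %w[moov], warnings: new_warnings)
```

In this sketch the first call records one warning for the udta box, while the second call records none and still lists udta as a child of moov, just without any children of its own.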