benrr101 / node-taglib-sharp

A node.js port of mono/taglib-sharp
GNU Lesser General Public License v2.1
42 stars 11 forks source link

"Granular position is too large to be handled with this version of node-taglib-sharp" when parsing ogg file #62

Closed digimezzo closed 2 years ago

digimezzo commented 2 years ago

I discovered an ogg file in the wild that throws this exception when parsed: "Granular position is too large to be handled with this version of node-taglib-sharp".

When debugging, I found that the expression in const absoluteGranularPosition = data.subarray(6, 8).toUlong(false); in oggPageHeader.ts returns 0n.

The check if (absoluteGranularPosition > Number.MAX_SAFE_INTEGER){} considers 0n larger than Number.MAX_SAFE_INTEGER which throws the error. To resolve this problem and allow the file to be parsed, I removed the check if (absoluteGranularPosition > Number.MAX_SAFE_INTEGER){}, for 2 reasons:

  1. 0n converts to 0 fine via Number(absoluteGranularPosition)
  2. If the conversion to Number would fail because the value in absoluteGranularPosition is really larger than Number.MAX_SAFE_INTEGER, I assume that an error will be thrown anyway.

Would this be an acceptable fix or did you add the check because of other risks that I'm missing?

I've attached the file in question. calm3.zip

benrr101 commented 2 years ago

Hmm. That's an interesting issue. I'll take a look shortly to see some more of the details, but I thought I'd give an explanation why that check is there (and I'll do my best off the top of my head). Basically, the granular position ends up being an offset into the file. While newer versions of node support bigint for file offsets, the version I'm currently targeting does not. As such, any file offsets are stored in node-taglib-sharp as regular numbers, and generally this check exists to make sure we don't try to work with files that are bigger than that version of node can support. I'm not entirely sure the behavior of the fs functions if you provide it an unsafe integer.

What's interesting is that 0n is bigger than MAX_SAFE_INTEGER. Let me look into a better way to make this check, and I'll see about getting a patch fix out.

Lastly, thanks for your interest in this project, thoroughly looking into the code, and giving me a sample file! Makes my life easier and makes me feel good someone's actually trying this project!

digimezzo commented 2 years ago

Thank you very much for explaining the reason behind this check. It's indeed interesting why 0n is considered larger than MAX_SAFE_INTEGER. I've done a thorough search on the internet and couldn't yet find an explanation.

You're very welcome. I love your library. I've been using TagLib# for years for an audio player that runs solely on Windows. I've since created a multiplatform version too, using Electron and uses node-taglib-sharp.

benrr101 commented 2 years ago

Initial findings - the first page that's parsed has a granule position of 0n which proceeds successfully. The second one appears to have a granule position of FF FF FF FF FF FF FF FF which is definitely bigger than MAX_SAFE_INTEGER if processed as a ulong. After digging into the spec, it appears that this is a special value

A special value of -1 (in two's complement) indicates that no packets finish on this page.

So it looks like this page might be able to be thrown out or something. I'll try to figure out the implications of this later today.

benrr101 commented 2 years ago

Ok, I've taken a solid look the implications, and it looks like this is just a special granule position value. As far as I can tell (and this digs into the code the taglib# devs prefaced with "if you don't understand this, you are not alone, it is confusing as hell"), the granule position passes through without any modification, so simply detecting the special value and handling it seems fine.

I was mostly wrong in saying that the granule position corresponds to a file offset. The granule position has a different meaning to each ogg codec, but for our purposes, it can be used to calculate the duration of the file. Although some fancy logic applies, it's basically last granule position minus first granule position. And what's more, the 0xFFFFFFFFFFFFFFFF value would only appear in the middle of a file and wouldn't impact the calculation of duration. As such, I don't need to limit this field to a number. However, that's a bigger change, so I'll table it for the moment in favor of a small fix to unblock you.

As a sidenote, the original taglib# pagination code doesn't consider the case of all page segments being 0xFF. If the paginator generates a page like that, the granule position should be changed to 0xFFFFFFFFFFFFFFFF. I'm guessing it is a super unlikely situation, but I wonder how it would impact the file.

Lastly, I've got a hotfix branch ready hotfix/ogg-bigint-comparison, if you'd like to test it. Otherwise, I'll merge it in and have v5.0.1 out shortly.

digimezzo commented 2 years ago

Thank you so much for looking into this and for providing the fix. I must be honest and admit I don't fully understand granule positions and pagination (I'm somewhat relieved though that the TagLib# developers feel the same way). I do understand the modifications though that you've applied to handle this case. Very clean!

I tested your fix and it works perfectly. Thanks again!

benrr101 commented 2 years ago

Excellent, thanks for testing it! I've got just published the new version on npm, https://www.npmjs.com/package/node-taglib-sharp So for now, I'm going to close this issue and open a new one to track removing the safe integer check all together.

I also noticed you're working on adding the MPEG support I haven't been able to get to - I'm absolutely flattered! Feel free to reach out via email if you have any questions. I'm really excited someone else is interested in contributing 😄

digimezzo commented 2 years ago

Thank you for publishing a new version! I'll integrate it in my project. It'll allow me to close a bug :)

I'm indeed attempting to add mpeg4 support. I planned to notify you as soon as I got a little further, as I'm honestly not sure yet where this will lead me. They have some challenging inheritance going on with the Boxes. But I really want to get this to work and won't hesitate to contact you if I get stuck. Thank you.

You really did a tremendous job on this project. The current implementation (the helper classes like ByteVector and File as well as the implementation of the other file formats) really helps for the conversion of the mpeg4 implementation from C# to TypeScript.