Zeugma440 / atldotnet

Fully managed, portable and easy-to-use C# library to read and edit audio data and metadata (tags) from various audio formats, playlists and CUE sheets
MIT License
460 stars 61 forks source link

Issues when reading/modifying MP4 files containing MDAT atoms with 64bit lengths #261

Closed Numpsy closed 5 months ago

Numpsy commented 5 months ago

The problem

I was trying to use ATL to modify an MP4 file, and noticed that it started saying that the file was invalid after trying to add some XMP to a file, and then reading it back.

Environment

Details

From debugging the code and reading the MP4 specs, it looks like the issue is that the MDAT in the file in question has a 64bit / long length field (as documented in section 4.2 of the ISO_IEC_14496-12 specification), but ATL doesn't handle that and just tries to treat the length as 1:

This causes issues here: image

and also in navigateToAtom which in my case fails to find the UUID atom because the stream offset calculations get messed up by the 1 byte length (and also I think because it doesn't account for the extra 64bit length field in the atom).

Locally changing navigateToAtom to special case the length==1 case seems to fix my case (the file is actually only 4.35 MB in size), though it looks like actually treating the sizes as possibly 64-bit in the general case would be a more wide ranging change.

Code To Reproduce Issue [ Good To Have ]

I found the issue with a demo mp4 file at work, and am just confirming it's ok to make that available as a sample.

Numpsy commented 5 months ago

This file seems to show the issue:

https://github.com/Zeugma440/atldotnet/assets/1178570/26d5e109-ad7a-42ac-82fa-18c7134891f3

(I can't post my original test file, and this test file has it's UUID box before the mdat, so ATL does read the XMP data in that case)

Zeugma440 commented 5 months ago

Thanks a bunch for the detailed diagnosis and the sample file 👍

The problem has been fixed; the fix will be available on next release ✅

Numpsy commented 5 months ago

Thanks :-)

A question, in case there is any potential issue: Does the AudioDataOffset calculation at https://github.com/Zeugma440/atldotnet/blob/5111dfc7a4a9a1cb8503ebdee196fe721f007de3/ATL/AudioData/IO/MP4.cs#L452 need to care about the potentially 16byte header size? (Just wondering if there is scope for the AddZone() at https://github.com/Zeugma440/atldotnet/blob/5111dfc7a4a9a1cb8503ebdee196fe721f007de3/ATL/AudioData/IO/MP4.cs#L457 to get the wrong offset)

Zeugma440 commented 5 months ago

I appreciate your subtle way to inquire about that ugly cast from long to uint I've just commited 😂

The main reason is that switching navigateToAtom's return type to long was creating too many side effects.

What's more, the case you're referring to would only be valid for 4.2 GB+ files, which is technically possible, but very unusual in the case of MP4s/M4As. Maybe a 15-ish hours long audiobook, or an artistic performance of some sorts would qualify?

Agagamand commented 5 months ago

Maybe a 15-ish hours long audiobook, or an artistic performance of some sorts would qualify?

The movie.

Zeugma440 commented 5 months ago

The movie.

The movie Numpsy uploaded? I don't think so : it does use the 64-bit variant in its header, but the first 4 bytes are null, which kinda misses the point.

Any HD movie using the MP4 container? Sure, but ATL doesn't officially supports movie formats. It happens to work with MP4s containing video streams, but hasn't really been designed for that.

If you need it to work on 4GB+ movies for the software you're making, please be more explicit. I'd rather work on an actual requirement than covering theoretical use cases.

Numpsy commented 5 months ago

The movie Numpsy uploaded?

That sample file is just something that was created after someone noticed that Camtasia seems to create files using the 64-bit lengths even if they're small (the original private file I was testing at the time was itself only 4.x megabytes as well).

My comment about the offsets was about the calculated value of AudioDataOffset here (5706 for the attached test file):

image

where I thought the fixed -8 there might be wrong if 64bit fields are in use. If I upload the same test file to https://gpac.github.io/mp4box.js/test/filereader.html then I get this for the mdat start location image which is 8 bytes earlier in the stream.

As that value is passed to structureHelper.AddZone when the file is being written, I wonder if that could cause problems if the offset actually is off?

Zeugma440 commented 5 months ago

where I thought the fixed -8 there might be wrong if 64bit fields are in use.

My bad. I do get your point now, and it has to be fixed. Thanks for highlighting that

Zeugma440 commented 5 months ago

Fix is available on today's v5.22.

Please close the issue if it works on your side~

Numpsy commented 5 months ago

Fix is available on today's v5.22.

Please close the issue if it works on your side~

Thanks, I'll give it a try.

Numpsy commented 5 months ago

Hmm, does the -8 at https://github.com/Zeugma440/atldotnet/blob/68c5f659b3bfb1381eb0902326e71c5f84bd65ba/ATL/AudioData/IO/MP4.cs#L1593 need to take account of the 8/16 byte variable header length? (Otherwise I think there might be an issue when there is a field with a 64bit length with other fields after it, still testing though)

Zeugma440 commented 5 months ago

Absolutely! I've adjusted that for the next release.

Numpsy commented 5 months ago

thanks

Zeugma440 commented 5 months ago

Fix is available on today's v5.23

Please close the issue if everything works on your side

Numpsy commented 5 months ago

Looks good now, thanks.