JamesHeinrich / getID3

http://www.getid3.org/

mp4 uuid code fails with unpack(): Type H: not enough input #226

Closed ben-xo closed 4 years ago

ben-xo commented 4 years ago

The full error is

Error: unpack(): Type H: not enough input, need 4, have 0 (on line 1648 of /Users/ben/Documents/workspace/dir2cast/getID3/module.audio-video.quicktime.php)

The file in question is 2.5 GB in size, and AtomicParsley shows the following structure (uuid near the bottom):

Atom ftyp @ 0 of size: 24, ends @ 24
Atom moov @ 24 of size: 2381262, ends @ 2381286
     Atom mvhd @ 32 of size: 108, ends @ 140
     Atom trak @ 140 of size: 1271710, ends @ 1271850
         Atom tkhd @ 148 of size: 92, ends @ 240
         Atom edts @ 240 of size: 36, ends @ 276
             Atom elst @ 248 of size: 28, ends @ 276
         Atom mdia @ 276 of size: 1271574, ends @ 1271850
             Atom mdhd @ 284 of size: 32, ends @ 316
             Atom hdlr @ 316 of size: 64, ends @ 380
             Atom minf @ 380 of size: 1271470, ends @ 1271850
                 Atom vmhd @ 388 of size: 20, ends @ 408
                 Atom hdlr @ 408 of size: 51, ends @ 459
                 Atom dinf @ 459 of size: 36, ends @ 495
                     Atom dref @ 467 of size: 28, ends @ 495
                         Atom url  @ 483 of size: 12, ends @ 495
                 Atom stbl @ 495 of size: 1271355, ends @ 1271850
                     Atom stsd @ 503 of size: 151, ends @ 654
                         Atom avc1 @ 519 of size: 135, ends @ 654
                             Atom avcC @ 605 of size: 49, ends @ 654
                     Atom stts @ 654 of size: 24, ends @ 678
                     Atom stss @ 678 of size: 21064, ends @ 21742
                     Atom sdtp @ 21742 of size: 130464, ends @ 152206
                     Atom stsc @ 152206 of size: 40, ends @ 152246
                     Atom stsz @ 152246 of size: 521828, ends @ 674074
                     Atom stco @ 674074 of size: 52200, ends @ 726274
                     Atom ctts @ 726274 of size: 545576, ends @ 1271850
     Atom trak @ 1271850 of size: 1109378, ends @ 2381228
         Atom tkhd @ 1271858 of size: 92, ends @ 1271950
         Atom edts @ 1271950 of size: 36, ends @ 1271986
             Atom elst @ 1271958 of size: 28, ends @ 1271986
         Atom mdia @ 1271986 of size: 1109242, ends @ 2381228
             Atom mdhd @ 1271994 of size: 32, ends @ 1272026
             Atom hdlr @ 1272026 of size: 68, ends @ 1272094
             Atom minf @ 1272094 of size: 1109134, ends @ 2381228
                 Atom smhd @ 1272102 of size: 16, ends @ 1272118
                 Atom hdlr @ 1272118 of size: 51, ends @ 1272169
                 Atom dinf @ 1272169 of size: 36, ends @ 1272205
                     Atom dref @ 1272177 of size: 28, ends @ 1272205
                         Atom url  @ 1272193 of size: 12, ends @ 1272205
                 Atom stbl @ 1272205 of size: 1109023, ends @ 2381228
                     Atom stsd @ 1272213 of size: 91, ends @ 1272304
                         Atom mp4a @ 1272229 of size: 75, ends @ 1272304
                             Atom esds @ 1272265 of size: 39, ends @ 1272304
                     Atom stts @ 1272304 of size: 24, ends @ 1272328
                     Atom stsc @ 1272328 of size: 78292, ends @ 1350620
                     Atom stsz @ 1350620 of size: 978408, ends @ 2329028
                     Atom stco @ 2329028 of size: 52200, ends @ 2381228
     Atom udta @ 2381228 of size: 58, ends @ 2381286
         Atom ©TIM @ 2381236 of size: 23, ends @ 2381259                     ~
         Atom ©TSC @ 2381259 of size: 14, ends @ 2381273                     ~
         Atom ©TSZ @ 2381273 of size: 13, ends @ 2381286                     ~
Atom uuid=be7acfcb-97a9-42e8-9c71-999491e3afac @ 2381286 of size: 317568, ends @ 2698854
Atom mdat @ 2698854 of size: 2674658248 (^), ends @ 2677357102
             (^)denotes a 64-bit atom length

 ~ denotes an unknown atom
------------------------------------------------------
Total size: 2677357102 bytes; 53 atoms total.
Media data: 2674658248 bytes; 2698854 bytes all other atoms (0.101% atom overhead).
Total free atom space: 0 bytes; 0.000% waste.
------------------------------------------------------
AtomicParsley version: 0.9.6 (utf8)
------------------------------------------------------

I get the same output with PHP 7.1.32 and PHP 7.4.1.

I'll see if I can debug any further.

ben-xo commented 4 years ago

I notice the uuid code was only added recently; it has a note from 2019-Oct-31 in the code.

JamesHeinrich commented 4 years ago

Are you sure you're using the most recent version, with the more recent changes from https://github.com/JamesHeinrich/getID3/commit/ed22a3edc4fcf4c80ed00d9f60d191dcbb2d7881 ? The current line 1648 does not contain any unpack() call, and the symptom you describe is far more likely to come from code that predates the changes made in https://github.com/JamesHeinrich/getID3/commit/ed22a3edc4fcf4c80ed00d9f60d191dcbb2d7881

ben-xo commented 4 years ago

Hey @JamesHeinrich, I rebased on latest and the bug is still there. It's actually a container parsing bug!

I have a patch. PR incoming.

ben-xo commented 4 years ago

This patch fixes the bug and guards against similar parsing errors. It's also possible that it may be an underlying off-by-some bug with the data passed to the atom-parsing code, though. But I didn't find one, and in any case this patch contains a reasonable safety check.

ben-xo commented 4 years ago

FYI, what was happening was that the uuid atom was being encountered (but with 0 length) whilst still parsing the moov atom. But as you can see from the AtomicParsley output, it's not a subatom of moov.
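To illustrate the kind of guard being described, here is a minimal Python sketch (not getID3's PHP code, and not the actual patch) of walking child atoms while never reading past the end of the parent container. The names `iter_atoms` and `make_atom` are hypothetical helpers made up for this example, and the test data is synthetic.

```python
import struct


def iter_atoms(buf, start, end):
    """Yield (type, payload_start, atom_end) for each atom in buf[start:end].

    Stops as soon as an atom header is malformed or the atom would
    extend past `end`, so a sibling atom is never parsed as a child.
    """
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit length follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the atom runs to the end of its container.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed, or overruns the parent: stop instead of guessing
        yield kind, pos + header, pos + size
        pos += size


def make_atom(kind, payload=b""):
    """Build a simple atom with a 32-bit size field (test helper)."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


# Synthetic layout mirroring the report: moov, then a top-level uuid atom.
mvhd = make_atom(b"mvhd", b"\x00" * 4)
moov = make_atom(b"moov", mvhd)
uuid_atom = make_atom(b"uuid", bytes(16) + b"payload")
data = moov + uuid_atom

top_level = list(iter_atoms(data, 0, len(data)))
moov_kind, moov_start, moov_end = top_level[0]
children = [kind for kind, _, _ in iter_atoms(data, moov_start, moov_end)]
```

Because the child walk is bounded by `moov_end`, the uuid atom only shows up in `top_level` and never in `children`, which matches the structure AtomicParsley reports.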

ben-xo commented 4 years ago

By the way, http://fileformats.archiveteam.org/wiki/Boxes/atoms_format has a shortlist of "known" UUIDs, although of course it's impossible to enumerate them all, as they're essentially app-specific.

But it would seem the right way to identify "360Fly Sensor Data" or otherwise may be to match on the UUID, rather than the content of the atom?
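As a rough sketch of what matching on the UUID could look like (in Python, not getID3's PHP, and not the code that was eventually committed): read the 16 bytes that follow the `uuid` type field and look them up in a table of parsers. The only UUID used here is the one from the AtomicParsley output above, which is, as far as I know, the one used for XMP packets. The function names and the returned dictionaries are hypothetical, and only 32-bit atom sizes are handled.

```python
import struct
import uuid

# UUID taken from the AtomicParsley dump in this issue (assumed to be XMP).
XMP_UUID = uuid.UUID("be7acfcb-97a9-42e8-9c71-999491e3afac")


def parse_xmp(payload):
    """Hypothetical handler: treat the payload as a UTF-8 XMP packet."""
    return {"kind": "xmp", "xml": payload.decode("utf-8", errors="replace")}


def parse_unknown(payload):
    """Fallback for UUIDs with no special parsing: record the length only."""
    return {"kind": "unknown", "length": len(payload)}


# Dispatch table keyed on the UUID, not on the atom's content.
UUID_PARSERS = {XMP_UUID: parse_xmp}


def parse_uuid_atom(atom):
    """Parse one complete uuid atom: size(4) + 'uuid'(4) + uuid(16) + payload."""
    if len(atom) < 24:
        raise ValueError("uuid atom too short to hold a UUID")
    size, kind = struct.unpack_from(">I4s", atom, 0)
    if kind != b"uuid":
        raise ValueError("not a uuid atom")
    if size < 24 or size > len(atom):
        raise ValueError("uuid atom size is inconsistent with the data")
    atom_uuid = uuid.UUID(bytes=bytes(atom[8:24]))
    payload = atom[24:size]
    result = UUID_PARSERS.get(atom_uuid, parse_unknown)(payload)
    result["uuid"] = str(atom_uuid)
    return result


# Synthetic example atoms.
body = b"<x:xmpmeta/>"
xmp_atom = struct.pack(">I4s", 24 + len(body), b"uuid") + XMP_UUID.bytes + body
other_atom = struct.pack(">I4s", 24, b"uuid") + bytes(16)

xmp_info = parse_uuid_atom(xmp_atom)
other_info = parse_uuid_atom(other_atom)
```

Unrecognised UUIDs fall through to `parse_unknown`, so app-specific atoms are still reported without any guessing based on their content.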

JamesHeinrich commented 4 years ago

360fly and XMP are now parsed when their matching UUID is found; other recognized UUIDs from the shortlist are included but have no special parsing. https://github.com/JamesHeinrich/getID3/commit/dee9eb7858518f6285f854b19100246e3c31201e