h2non / filetype

Fast, dependency-free Go package to infer binary file types based on the magic numbers header signature
https://pkg.go.dev/github.com/h2non/filetype?tab=doc
MIT License

check quicktime for 'ftyp qt  ' magic numbers #102

Open nharsh opened 3 years ago

nharsh commented 3 years ago

This PR adds a match case for QuickTime movie files that ignores the first four bytes of the first atom and instead looks for a Type of 'ftyp' and a Major Brand of 'qt  ' (with two trailing spaces).

It seems that filetype sometimes fails to identify QuickTime videos when the first four bytes aren't 0x0, 0x0, 0x0, 0x14 and bytes 13-16 aren't the type field of an mdat atom. This possibly happens when the first atom is a File Type Compatibility Atom. By adding a case that ignores the first field (presumably Size) and instead matches on the Type and Major Brand fields, we hope to catch some mov files that would otherwise be missed.
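A minimal sketch of the kind of check described above, for illustration only. The helper name `isQuickTimeFtyp` is hypothetical and is not part of the filetype API or necessarily the code in this PR; it assumes the caller passes at least the first 12 bytes of the file.

```go
package main

import (
	"bytes"
	"fmt"
)

// isQuickTimeFtyp is a hypothetical helper (not part of the filetype API).
// It skips the 4-byte Size field and matches only on
// Type == "ftyp" (bytes 4-7) and Major Brand == "qt  " (bytes 8-11).
func isQuickTimeFtyp(buf []byte) bool {
	return len(buf) >= 12 &&
		bytes.Equal(buf[4:8], []byte("ftyp")) &&
		bytes.Equal(buf[8:12], []byte("qt  "))
}

func main() {
	// The Size field here is 0x20 rather than 0x14, so a check that pins
	// the first four bytes would miss this header.
	header := []byte{0x00, 0x00, 0x00, 0x20, 'f', 't', 'y', 'p', 'q', 't', ' ', ' '}
	fmt.Println(isQuickTimeFtyp(header)) // true
}
```

Because the Size field is never read, the match does not depend on how many compatible brands follow the Major Brand.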

For reference, this doc from Apple https://developer.apple.com/library/archive/documentation/QuickTime/QTFF/QTFFChap1/qtff1.html#//apple_ref/doc/uid/TP40000939-CH203-CJBCBIFF describes an atom that matches our case:

The file type atom has an atom type value of 'ftyp' and contains the following fields:
Size
A 32-bit unsigned integer that specifies the number of bytes in this atom.
Type
A 32-bit unsigned integer that identifies the atom type, typically represented as a four-character code; this field must be set to 'ftyp'.
Major_Brand
A 32-bit unsigned integer that should be set to 'qt  ' (note the two trailing ASCII space characters) for QuickTime movie files.
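
To make the field layout quoted above concrete, here is a small sketch that decodes the first three fields of a file type atom. The `ftypHeader` type and `parseFtypHeader` function are hypothetical names used only for illustration; the sketch assumes big-endian byte order, as QuickTime atoms use, and reads only the first 12 bytes.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// ftypHeader holds the first three fields of a file type atom.
// Hypothetical type for illustration only.
type ftypHeader struct {
	Size       uint32 // number of bytes in the atom
	Type       string // four-character code, expected "ftyp"
	MajorBrand string // four-character code, "qt  " for QuickTime
}

// parseFtypHeader decodes Size, Type and Major Brand from the first
// 12 bytes of buf. It does not validate the field values.
func parseFtypHeader(buf []byte) (ftypHeader, error) {
	if len(buf) < 12 {
		return ftypHeader{}, errors.New("need at least 12 bytes")
	}
	return ftypHeader{
		Size:       binary.BigEndian.Uint32(buf[0:4]),
		Type:       string(buf[4:8]),
		MajorBrand: string(buf[8:12]),
	}, nil
}

func main() {
	h, err := parseFtypHeader([]byte("\x00\x00\x00\x14ftypqt  "))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// %q makes the two trailing spaces in the Major Brand visible.
	fmt.Printf("size=%d type=%q brand=%q\n", h.Size, h.Type, h.MajorBrand)
}
```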

This is the first PR I've opened here so if there is any procedure I've missed please let me know.

jollyjoker992 commented 3 years ago

@h2non Is it possible to merge this?