drewnoakes / metadata-extractor

Extracts Exif, IPTC, XMP, ICC and other metadata from image, video and audio files
Apache License 2.0

Avoiding Redundancy and Maximizing Metadata from ISOBMFF #388

Open payton opened 5 years ago

payton commented 5 years ago

As we add more formats based on the ISO Base Media File Format (ISOBMFF) spec, we will see more and more redundancy in our code. For example, the following formats all follow ISOBMFF: HEIF, MP4, CR3, 3GP, 3G2…

I’m hoping to get some feedback on a structure that reduces this redundancy. Ideally, the reader would decide how to parse a file based on its major brand and compatible brands, and each format would only reimplement the boxes/handlers it actually needs.

To demonstrate my point, ISO/IEC 14496-12 (Annex E) lists the boxes that must be supported for the isom brand. It then defines several brands that build on each other: isom -> avc1 -> iso2 -> iso3 ... Each new brand inherits support (and box structure) from the previous brand, with a few additions.

There is also a brand specifically for MP4 that inherits from isom: isom -> mp42
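
To make the inheritance idea concrete, here is a rough sketch of how the brand lineage could be modelled. Everything in it is hypothetical: `BrandLineage`, `lineage` and `nearestSupported` are names I made up for illustration, not existing metadata-extractor classes, and the parent map only contains the chains mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical registry of brand lineage; not part of metadata-extractor. */
final class BrandLineage {
    // Child brand -> the brand it builds on, limited to the chains described in this issue.
    private static final Map<String, String> PARENT = new HashMap<>();

    static {
        PARENT.put("avc1", "isom");
        PARENT.put("iso2", "avc1");
        PARENT.put("iso3", "iso2");
        PARENT.put("mp42", "isom");
    }

    private BrandLineage() {
    }

    /** Returns the brand followed by every brand it inherits from, most specific first. */
    static List<String> lineage(String brand) {
        List<String> chain = new ArrayList<>();
        for (String b = brand; b != null; b = PARENT.get(b)) {
            chain.add(b);
        }
        return chain;
    }

    /** Finds the most specific brand in the lineage for which a handler exists. */
    static Optional<String> nearestSupported(String brand, Set<String> supported) {
        return lineage(brand).stream().filter(supported::contains).findFirst();
    }
}
```

With something like this, a handler written for isom would automatically apply to iso3 or mp42 files until a more specific handler is added.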

An official registry of brands/boxes exists: http://mp4ra.org/#/, but other brands/boxes are widely used without being registered. Ideally, we would make decisions based on both kinds of brands to extract as much information as possible.

Let’s say MP4 (brand mp42) is not supported and we come across a file with major brand mp42. We don’t want to throw this file out yet. If isom, which is supported, exists in the list of compatible brands, we can still read it and get some valuable information.
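
A minimal sketch of that fallback, assuming the `ftyp` box has already been read into a byte array and uses a 32-bit size field (no 64-bit `largesize`, no size of 0). The class and method names (`FtypBrandSelector`, `parse`, `selectBrand`) are hypothetical and not existing metadata-extractor API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper: reads an ftyp box and picks a brand we know how to handle. */
final class FtypBrandSelector {
    final String majorBrand;
    final List<String> compatibleBrands;

    private FtypBrandSelector(String majorBrand, List<String> compatibleBrands) {
        this.majorBrand = majorBrand;
        this.compatibleBrands = compatibleBrands;
    }

    /** Parses size(4) + "ftyp"(4) + major_brand(4) + minor_version(4) + compatible_brands(4 each). */
    static FtypBrandSelector parse(byte[] box) {
        if (box.length < 16) {
            throw new IllegalArgumentException("ftyp box is too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(box); // ByteBuffer is big-endian by default
        long size = buf.getInt() & 0xFFFFFFFFL;
        String type = fourCc(buf);
        if (!"ftyp".equals(type) || size < 16 || size > box.length) {
            throw new IllegalArgumentException("not a well-formed 32-bit ftyp box");
        }
        String major = fourCc(buf);
        buf.getInt(); // minor_version, not needed for brand selection
        List<String> compatible = new ArrayList<>();
        while (buf.position() + 4 <= size) {
            compatible.add(fourCc(buf));
        }
        return new FtypBrandSelector(major, compatible);
    }

    private static String fourCc(ByteBuffer buf) {
        byte[] bytes = new byte[4];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Prefers the major brand, then falls back to the first supported compatible brand. */
    Optional<String> selectBrand(Set<String> supported) {
        if (supported.contains(majorBrand)) {
            return Optional.of(majorBrand);
        }
        return compatibleBrands.stream().filter(supported::contains).findFirst();
    }
}
```

For a file with major brand mp42 and compatible brands mp42 and isom, `selectBrand` would return isom when only isom is supported, and an empty result only when none of the listed brands is known.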

Opening this issue for transparency and any input/ideas.

drewnoakes commented 5 years ago

This sounds like a good initiative, Payton. I'm not familiar with the format, so I can't provide more than encouragement at this point.