MPEGGroup / FileFormat

MPEG file format discussions
Clarify how compressed metadata should work #61

Open leo-barnes opened 2 years ago

leo-barnes commented 2 years ago

It would most likely be beneficial to compress Exif and XMP metadata to save space.

XMP is specified with an item_type of mime, which allows you to specify a content_encoding. This could be used to specify any of the usual compression schemes.
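
To make the idea concrete, here is a minimal sketch of what a deflate content_encoding could look like for an XMP payload, using Python's zlib module. It assumes the zlib-wrapped stream (RFC 1950) that HTTP uses for its "deflate" coding; whether a given writer or reader expects that exact framing is an assumption here, not something taken from the spec text. The sample packet and both helper names are made up for illustration.

```python
import zlib

# Hypothetical XMP payload; real packets are usually much larger and
# contain a lot of repetitive XML, which is why they compress well.
XMP_PACKET = (
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    + b"<rdf:li>keyword</rdf:li>" * 200
    + b"</x:xmpmeta>"
)


def encode_item_payload(data: bytes) -> bytes:
    """Compress an item payload as a zlib-wrapped deflate stream (assumed framing)."""
    return zlib.compress(data, 9)


def decode_item_payload(payload: bytes) -> bytes:
    """Restore the original payload; note that no size limit is enforced here."""
    return zlib.decompress(payload)


compressed = encode_item_payload(XMP_PACKET)
restored = decode_item_payload(compressed)
print(len(XMP_PACKET), len(compressed), restored == XMP_PACKET)
```

The item would still carry item_type mime and its usual content type; only the stored bytes and the content_encoding string change.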

Exif is specified with an item_type of Exif, which does not allow specifying an encoding.

Two questions that need clarification:

  1. Is it possible to specify Exif with an item_type of mime and some already specified MIME type? If not, we should introduce some kind of compressed Exif item_type.
  2. Which content encodings should parsers support for Exif and XMP?

cconcolato commented 2 years ago

Thanks for raising the issue. These are good questions.

  1. I could not find any registered MIME type for EXIF at https://www.iana.org/assignments/media-types/media-types.xhtml, so it seems that compressed EXIF would need another item_type or another item feature.
  2. I'm not sure it's an ISOBMFF question. ISOBMFF typically does not recommend or put restrictions on features. Maybe more of a MIAF question. That said, the feature of "compressed boxes" in ISOBMFF (section 8.19 in the latest edition) uses "deflate" so it might be a good candidate.

leo-barnes commented 2 years ago

One issue with using the content_encoding field is that the uncompressed size is not stored anywhere. So a parser has no idea how large it will be when unpacked. Maybe not a critical issue, but it makes it harder to have strict checks.

leo-barnes commented 11 months ago

I think I added a conformance file containing compressed XMP metadata. But we still have no way to have compressed Exif. Maybe we should morph this issue into tracking some way of adding that.

cconcolato commented 3 months ago

@leo-barnes do you plan to submit an example of compressed Exif for the next meeting?

leo-barnes commented 3 months ago

Probably won't have time to do it unfortunately.

leo-barnes commented 3 months ago

On second thought, let me see if I can't put something together...

farindk commented 3 months ago

@leo-barnes libheif generates deflate-compressed XMP metadata when you encode with --enable-metadata-compression, provided the feature was enabled at compile time with -DWITH_DEFLATE_HEADER_COMPRESSION=ON.

> One issue with using the content_encoding field is that the uncompressed size is not stored anywhere. So a parser has no idea how large it will be when unpacked. Maybe not a critical issue, but it makes it harder to have strict checks.

At least for deflate, the data is usually decoded in chunks, so the application can always throw an error once the output exceeds a security limit. An uncompressed size field in the file would not increase security either, since nothing guarantees that the compressed data actually decodes to the size indicated there.
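
The chunked approach could be sketched roughly like this in Python. This is only an illustration of the idea, not how libheif does it: the function name, the exception class, and the default chunk size are all hypothetical, and it again assumes a zlib-wrapped deflate stream.

```python
import zlib


class MetadataTooLarge(Exception):
    """Raised when the decoded payload would exceed the configured limit."""


def inflate_bounded(payload: bytes, limit: int, chunk_size: int = 4096) -> bytes:
    """Decode a zlib stream in chunks, refusing to produce more than `limit` bytes."""
    decoder = zlib.decompressobj()
    out = bytearray()
    pending = payload
    while True:
        # Request at most one byte beyond the remaining budget, so an
        # oversized stream is detected without buffering much extra data.
        room = min(chunk_size, limit - len(out) + 1)
        chunk = decoder.decompress(pending, room)
        out += chunk
        if len(out) > limit:
            raise MetadataTooLarge(f"decoded metadata exceeds {limit} bytes")
        if decoder.eof:
            break
        pending = decoder.unconsumed_tail
        if not chunk and not pending:
            # No input left, no output produced, and no end-of-stream marker.
            raise ValueError("truncated deflate stream")
    return bytes(out)


# A highly compressible payload: small on disk, large once decoded.
bomb = zlib.compress(b"A" * 1_000_000)
try:
    inflate_bounded(bomb, limit=64 * 1024)
except MetadataTooLarge as err:
    print("rejected:", err)
```

The limit is enforced on the bytes actually produced, so the check holds no matter what size a file might claim for the decoded data.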