libjxl / libjxl

JPEG XL image format reference implementation
BSD 3-Clause "New" or "Revised" License

Add image information as xmp metadata to avoid having to parse the bitstream #1806

Open eugenesvk opened 1 year ago

eugenesvk commented 1 year ago

Is your feature request related to a problem? Please describe.

Descriptive image information is important in many image workflows. One of the common tools for getting such information is exiftool. However, exiftool currently doesn't support some of the basic technical info that's available for other image formats.

Describe the solution you'd like

I'd like your cjxl tool to write some key image information into the resulting image's metadata, e.g.

This could be optional behind a CLI flag, to save a few bytes for those who don't care about such information.

Describe alternatives you've considered

I've opened an issue with ExifTool (https://github.com/exiftool/exiftool/issues/157), but a dev there mentioned that it might be too complicated for their tool to parse the bitstream to extract such info. Another alternative would be to parse the jxlinfo output, but that's also a rather poor substitute.

Additional context

None

kmilos commented 1 year ago

Agreed, and the same goes for exiv2 btw: resources are already thin, and there is little interest/will in spending time on decoding a new bitstream. Unrelated: the introduction of brotli as an extra dependency for other (known) metadata is not fantastic either. So if the JXL community wants to help adoption, please help with these metadata tools as well.

novomesk commented 1 year ago

My € 0.02

I think it would be possible to have an option in cjxl to add minimal uncompressed metadata in an Exif or XMP block. However, as an option, it would not be used all the time. There are also other applications creating .jxl files, so you will not see that metadata too often.

As a user, I'd like to have the possibility in cjxl to disable metadata compression so that my own files work with current exiv2 now. Older cjxl versions wrote uncompressed metadata, but newer versions always compress it. The problem is that exiv2 doesn't support brotli-compressed blocks yet. As a consequence, KDE users do not see more info when browsing folders in the Dolphin application.

Fortunately, brotli is widely available, so it could be added to exiv2 as an optional dependency (if someone is willing to work on that). I understand that the metadata libraries are not against JXL, but they would appreciate more help or contributions.

eugenesvk commented 1 year ago

> There are also other applications creating .jxl files so you will not see that metadata too often.

Out of curiosity, don't they mostly use this reference encoder/library (in which case they'd get that feature as well)?

jonsneyers commented 1 year ago

I'm not really in favor of redundantly adding basic image metadata (like dimensions, bit depth etc) in Exif just to make it easier to find that info. There is a risk that such information doesn't stay in sync with the real information, etc.

Some tools like file, exiv2 and exiftool may want to show such information without introducing a dependency on libjxl though, which is perfectly understandable. This requires writing an independent header parser, something that I think @thebombzen has already worked on because it is also needed for ffmpeg. I think it could be useful to have a reference implementation of that available somewhere (either in this repo, or in a new repo of its own so it's clear you can use it without introducing a dependency on libjxl). Perhaps @lifthrasiir's J40 implementation could also be a good starting point for this (stripping it down to just what is needed to parse a header without doing any actual decoding).

I would propose to limit this to just what can be found in the image header without requiring entropy decoding, since anything more would go too far. That means the following information could be returned by such a parser:

What would not be 'easy' to parse (i.e. requires implementing entropy decode) is the following:

Still, I think the information that can be retrieved 'easily' is probably sufficient for tools like exiftool/exiv2.
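To give a sense of how small such an independent parser can be, here is a minimal sketch that reads only the image dimensions from the start of a bare codestream, with no entropy decoding. It assumes the size header layout as I understand it from the codestream specification (a "small" flag, then height, a 3-bit aspect-ratio selector, and an explicit width only when the selector is 0, all read least significant bit first). The names `BitReader` and `read_jxl_size` are made up for this example and are not part of libjxl; the layout should be checked against the specification before relying on it.

```python
CODESTREAM_SIGNATURE = b"\xff\x0a"

# Aspect ratios selected by the 3-bit ratio field; 0 means the width is coded explicitly.
ASPECT_RATIOS = {1: (1, 1), 2: (12, 10), 3: (4, 3), 4: (3, 2),
                 5: (16, 9), 6: (5, 4), 7: (2, 1)}


class BitReader:
    """Reads unsigned integers from a byte string, least significant bit first."""

    def __init__(self, data: bytes) -> None:
        self._value = int.from_bytes(data, "little")
        self._size = len(data) * 8
        self._pos = 0

    def read(self, nbits: int) -> int:
        if self._pos + nbits > self._size:
            raise ValueError("not enough data for the size header")
        result = (self._value >> self._pos) & ((1 << nbits) - 1)
        self._pos += nbits
        return result


def _read_dimension(reader: BitReader) -> int:
    # A 2-bit selector picks a 9-, 13-, 18- or 30-bit field holding (dimension - 1).
    nbits = (9, 13, 18, 30)[reader.read(2)]
    return reader.read(nbits) + 1


def read_jxl_size(codestream: bytes) -> tuple:
    """Return (width, height) from the first bytes of a bare JPEG XL codestream."""
    if not codestream.startswith(CODESTREAM_SIGNATURE):
        raise ValueError("not a bare JPEG XL codestream")
    # The size header needs at most 68 bits, so 10 bytes after the signature suffice.
    reader = BitReader(codestream[2:12])
    small = reader.read(1)
    if small:
        height = (reader.read(5) + 1) * 8  # multiples of 8, up to 256
    else:
        height = _read_dimension(reader)
    ratio = reader.read(3)
    if ratio:
        num, den = ASPECT_RATIOS[ratio]
        width = height * num // den
    elif small:
        width = (reader.read(5) + 1) * 8
    else:
        width = _read_dimension(reader)
    return width, height
```

This sketch ignores the orientation field and everything else that follows the size header, and for container files the codestream first has to be located inside the box structure.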

Then there is of course also the 'actual' metadata that is not basic image information that is already in the jxl codestream header, but things like copyright information. I think this is the only thing Exif/XMP should really be used for in jxl. I agree with @novomesk that it would be good to add an option to cjxl to select whether or not to compress such metadata, since uncompressed is (at least for now) better for interoperability.

I do think adding brob support to exiv2 and exiftool would be a good idea; this is useful not just for jxl but could be used in any ISOBMFF-based format. It matters especially because the size of metadata is likely to grow (cf. for example the Content Authenticity Initiative), and brotli compression can make a large difference for XML-based metadata.
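As an illustration of what brob support involves on the reading side, the sketch below walks the top-level boxes of a container file and reports which metadata boxes are present and whether each is Brotli-compressed. It assumes the usual ISOBMFF box header (32-bit big-endian size, 4-byte type, with size 1 meaning a 64-bit size follows and size 0 meaning "until end of file"), and it assumes a `brob` payload begins with the 4-byte type of the box it wraps. The helper names `iter_boxes` and `describe_metadata` are hypothetical. Actually decompressing the payload would additionally need a Brotli library, which is left out here.

```python
import struct
from typing import Iterator, List, Tuple

METADATA_TYPES = ("Exif", "xml ", "jumb")


def iter_boxes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (type, payload) for each top-level box in data."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit size follows the type field.
            if pos + 16 > len(data):
                raise ValueError("truncated box header")
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the file.
            size = len(data) - pos
        if size < header or pos + size > len(data):
            raise ValueError("truncated or malformed box")
        yield box_type.decode("latin-1"), data[pos + header:pos + size]
        pos += size


def describe_metadata(data: bytes) -> List[Tuple[str, bool]]:
    """Return (metadata box type, is_brotli_compressed) pairs in file order."""
    found = []
    for box_type, payload in iter_boxes(data):
        compressed = False
        if box_type == "brob" and len(payload) >= 4:
            # The first four payload bytes name the box that was compressed.
            box_type = payload[:4].decode("latin-1")
            compressed = True
        if box_type in METADATA_TYPES:
            found.append((box_type, compressed))
    return found
```

A tool that cannot decompress Brotli could still use this to tell the user that Exif or XMP data exists but is stored compressed, rather than silently showing nothing.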

novomesk commented 1 year ago

While most JXL implementations are libjxl-based, the API gives developers the freedom to encode either a naked codestream or a container-format JXL, with or without metadata boxes.
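The two forms can be told apart from the first bytes of a file. A minimal sketch, assuming the commonly documented signatures (`FF 0A` for a naked codestream and a 12-byte signature box for the container); the function name `jxl_kind` is made up for this example:

```python
from typing import Optional

CODESTREAM_SIGNATURE = b"\xff\x0a"
CONTAINER_SIGNATURE = b"\x00\x00\x00\x0cJXL \x0d\x0a\x87\x0a"


def jxl_kind(head: bytes) -> Optional[str]:
    """Classify a file by its leading bytes; return None if it is not JPEG XL."""
    if head.startswith(CONTAINER_SIGNATURE):
        return "container"
    if head.startswith(CODESTREAM_SIGNATURE):
        return "codestream"
    return None
```

Only the container form can carry Exif or XMP boxes, so a metadata tool that sees a naked codestream knows there is nothing beyond the codestream header to report.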

I am not a decision maker but I think not everyone would like the container + metadata default. For example some people like that JXL can be very light and doesn't have big headers like AVIF.

There could be a recommendation that developers add the metadata for interoperability reasons, but I'm just saying that not everyone will do it that way.

kmilos commented 1 year ago

> I do think adding brob support to exiv2 and exiftool

This will probably happen, just depends on resources. After all, exiv2 already leverages zlib/deflate for e.g. compressed Exif/XMP/ICC chunks in PNGs, so this is not an unnatural extension. It's just slightly annoying that a new dependency has to now be included with extra work along with it when a working solution/workhorse exists - a hint of the "not invented here" scent perhaps? 😉

eugenesvk commented 1 year ago

What about the encoder options, like the ones from the current CLI output "JPEG XL encoder v0.7.0" and "Encoding [VarDCT, d1.000, effort: 7]"? Is this info currently lost, or is it also part of some "easy"/"hard" header (similarly to how it's available for video files)?

> easier to find ... There is a risk that such information doesn't stay in sync with the real information, etc.

Sure, but why is this a bigger issue vs. having no easy access to information (also, how big do you think the risk is)?

tomalakgeretkal commented 1 year ago

> I'm not really in favor of redundantly adding basic image metadata (like dimensions, bit depth etc) in Exif just to make it easier to find that info.

That is, arguably, the very purpose of metadata.

gitoss commented 11 months ago

> What about the encoder options like the ones from the current cli output "JPEG XL encoder v0.7.0" and "Encoding [VarDCT, d1.000, effort: 7]"?

Related to https://github.com/libjxl/libjxl/issues/2507