Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/
Other

JXL/bmff support #1503

Closed arun54321 closed 3 years ago

arun54321 commented 3 years ago

Exiv2 currently doesn't recognize JXL images. It reports: "The file contains data of an unknown image type".

clanmills commented 3 years ago

I've never heard of a jpeg-xl image. Can you provide a test sample, please? More importantly, what is the use case for this format and why should Team Exiv2 undertake the work to support this?

The engineering effort to support a new image format is considerable. There has been a huge effort in the last few months to add support for bmff (.CR3, .HEIC, .AVIF). I'm delighted to say that we released Exiv2 v0.27.4 RC1 earlier today. https://discuss.pixls.us/t/exiv2-v0-27-4-rc1-is-available/24139

1div0 commented 3 years ago

JPEG XL is impressive, but it is still too early to focus on the support until it becomes available in the real, not virtual world.

kmilos commented 3 years ago

AFAICT, it is also BMFF based, no?

1div0 commented 3 years ago

No. C https://cloud.reflexion.tv/nextcloud/index.php/s/tQpN2gp8eZtnrmT

clanmills commented 3 years ago

The file smells more of JPEG than bmff.

509 rmills@rmillsm1:~/Downloads $ dmpf 20200717_221452.jxl count=40
       0        0: .......'.#J....EQ._.i_QA.`28...T  ->  ff 0a fa bb e8 f3 e1 27 85 23 4a 01 03 0a 10 45 51 14 00 08 69 00 51 41 01 60 32 38 9e 09 a0 54
    0x20       32: E....y..                          ->  45 8f b6 a4 fb 79 1d 9d
510 rmills@rmillsm1:~/Downloads $ tvisitor -pR 20200717_221452.jxl 
unknown format  20200717_221452.jxl
511 rmills@rmillsm1:~/Downloads $ dmpf ~/Stonehenge.jpg count=40
       0        0: ....;.Exif__II*_.___._..._.___._  ->  ff d8 ff e1 3b a8 45 78 69 66 00 00 49 49 2a 00 08 00 00 00 0b 00 0f 01 02 00 12 00 00 00 92 00
    0x20       32: __..._._                          ->  00 00 10 01 02 00 0c 00
512 rmills@rmillsm1:~/Downloads $ 

If that file has Exif data, it's been compressed or something.

528 rmills@rmillsm1:~/Downloads $ dmpf ~/Stonehenge.jpg | grep II\\*_
       0        0: ....;.Exif__II*_.___._..._.___._  ->  ff d8 ff e1 3b a8 45 78 69 66 00 00 49 49 2a 00 08 00 00 00 0b 00 0f 01 02 00 12 00 00 00 92 00
   0x340      832: Nikon_..__II*_.___9_._._.___0211  ->  4e 69 6b 6f 6e 00 02 10 00 00 49 49 2a 00 08 00 00 00 39 00 01 00 07 00 04 00 00 00 30 32 31 31
  0x46e0    18144: _.3..M..E.w..)...X....MPF_II*_._  ->  00 10 33 9e ec 4d fa f7 45 c9 77 18 cd 29 d2 87 c2 58 ff e2 0f fe 4d 50 46 00 49 49 2a 00 08 00
0x5e8ee0  6196960: ________________......Exif__II*_  ->  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff d8 ff e1 03 fe 45 78 69 66 00 00 49 49 2a 00
0x5f22e0  6234848: ________________......Exif__II*_  ->  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff d8 ff e1 03 fe 45 78 69 66 00 00 49 49 2a 00
529 rmills@rmillsm1:~/Downloads $ dmpf 20200717_221452.jxl | grep II\\*_
530 rmills@rmillsm1:~/Downloads $ dmpf 20200717_221452.jxl | grep -i exi
 0xbf460   783456: ..p........t...q+SF.....f.,._exi  ->  b7 b7 70 9a a1 a7 da f9 b0 b1 7f 74 03 af a3 71 2b 53 46 87 19 c7 09 f5 66 1c 2c 89 5f 65 78 69
 0xd4100   868608: .|.....~.....XEXIK..y5...M......  ->  7f 7c 82 9d c0 da c0 7e eb ce bf c7 e2 58 45 58 49 4b bc 86 79 35 9b 91 9f 4d 07 09 03 e0 18 83
531 rmills@rmillsm1:~/Downloads $ 
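
The grep above only looks for the little-endian TIFF signature II*\0, so it would miss a big-endian MM\0* header. As a rough sketch of the same search done in code (the function name findTiffSignatures is made up for this illustration and is not part of Exiv2 or dmpf), one could scan a buffer for both byte orders:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Exiv2 API): return the offset of every 4-byte
// TIFF header signature in `data`, either "II*\0" (little-endian) or
// "MM\0*" (big-endian).
std::vector<std::size_t> findTiffSignatures(const std::vector<std::uint8_t>& data) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i + 4 <= data.size(); ++i) {
        const bool little = data[i] == 'I' && data[i + 1] == 'I' &&
                            data[i + 2] == 0x2a && data[i + 3] == 0x00;
        const bool big = data[i] == 'M' && data[i + 1] == 'M' &&
                         data[i + 2] == 0x00 && data[i + 3] == 0x2a;
        if (little || big) hits.push_back(i);
    }
    return hits;
}
```

Fed the first bytes of Stonehenge.jpg from the dump above, this reports a hit at offset 12, right after "Exif\0\0". A hit only shows that the four bytes occur; it does not prove that a valid TIFF structure follows.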


clanmills commented 3 years ago

Without a request from a major Exiv2 user, such as darktable or The Gimp, this is unlikely to be undertaken. Our priorities in 2021:

  1. v0.27.4 with bmff support
  2. v0.28 'main'
  3. transition to KDE

1div0 commented 3 years ago

Encoding

peter.kovar@Pascal /1TB/usr/src/gitlab.com/wg1/jpeg-xl
€ time build/tools/cjxl ~/Obrázky/PNG/20200717_221452.png ~/Nextcloud/Photo/JPEG\ XL/20200717_221452.jxl
  J P E G   \/ |
            /\ |_   e n c o d e r    [v0.2.0 | SIMD supported: AVX2,SSE4,Scalar]
Read 4000x6016 image, 13.8 MP/s
Encoding [VarDCT, d1.000, squirrel], 6 threads.
Compressed to 2643100 bytes (0.879 bpp).
4000 x 6016, 6.25 MP/s [6.25, 6.25], 1 reps, 6 threads.

real    0m5,657s
user    0m7,788s
sys     0m1,237s

Decoding

peter.kovar@Pascal /1TB/usr/src/gitlab.com/wg1/jpeg-xl
€ time build/tools/djxl ~/Nextcloud/Photo/JPEG\ XL/20200717_221452.jxl ~/Obrázky/PNG/20200717_221452.jxl.png
Read 2643100 compressed bytes [v0.2.0 | SIMD supported: AVX2,SSE4,Scalar]
Done.
4000 x 6016, 36.31 MP/s [36.31, 36.31], 1 reps, 6 threads.
Allocations: 2276 (max bytes in use: 8.434688E+08)

real    0m7,194s
user    0m7,398s
sys     0m0,590s

jonsneyers commented 3 years ago

JPEG XL, if it contains metadata, is using ISOBMFF as a container, using the same Exif box (and xml for XMP) as other formats. Additionally, we are planning to add an option to do Brotli-compressed versions of exif and xmp metadata, though maybe it's a bit early to add support for that since JPEG XL Part 2 (which defines these things) is not finalized yet.

clanmills commented 3 years ago

Thank you @jonsneyers. Do you have a sample image which contains Exif/XMP/ICC/IPTC metadata?

20200717_221452.jxl doesn't look much like bmff, because I don't see box names such as ftyp, meta etc as I see in this avif file. There's no II*\0 sequence from the embedded Tiff that holds the Exif data.

543 rmills@rmillsm1:~/gnu/exiv2/team/book $ dmpf files/avif.avif count=40
       0        0: ___ ftypavif____mif1avifmiafMA1B  ->  00 00 00 20 66 74 79 70 61 76 69 66 00 00 00 00 6d 69 66 31 61 76 69 66 6d 69 61 66 4d 41 31 42
    0x20       32: __.0meta                          ->  00 00 01 30 6d 65 74 61
544 rmills@rmillsm1:~/gnu/exiv2/team/book $ 

The reason I said 20200717_221452.jxl file "smells like JPEG" is because it starts with byte 0xff.

jonsneyers commented 3 years ago

There are two kinds of valid jxl files:

  1. A naked codestream, which starts with the "start of jxl codestream" marker 0xFF0A (yes, that smells like JPEG, this is after all a codec standardized by JPEG).
  2. A codestream wrapped in an ISOBMFF container (signature JXL, ftyp jxl).

All render-impacting data is encoded in the codestream, including color space and orientation. So for web delivery, we anticipate option 1 will be most common. All non-render-impacting metadata is stored outside the codestream though, so if you want to have that, you need to use option 2. This includes Exif and XMP, but also JUMBF metadata and "JPEG bitstream reconstruction data", which is all non-image data needed to restore a losslessly recompressed JPEG file bit-exactly.

If you take the current cjxl and encode a PNG or JPEG which has Exif metadata, it will produce an "option 2" file by default and preserve the Exif metadata.
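
A minimal sketch of how a reader might tell the two layouts apart from the first bytes of a file. The names JxlKind and classifyJxl are hypothetical and are not part of Exiv2 or the jpeg-xl tools. The 12 container bytes are taken from the dump of Reagan.jxl later in this thread, not from the specification:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical classification of a .jxl file by its leading bytes.
enum class JxlKind { NakedCodestream, Container, Unknown };

JxlKind classifyJxl(const std::uint8_t* data, std::size_t size) {
    // Signature box as seen in Reagan.jxl: length 12, type "JXL ",
    // payload 0d 0a 87 0a.
    static const std::uint8_t kContainer[12] = {
        0x00, 0x00, 0x00, 0x0c, 'J', 'X', 'L', ' ', 0x0d, 0x0a, 0x87, 0x0a};
    if (size >= sizeof kContainer &&
        std::memcmp(data, kContainer, sizeof kContainer) == 0) {
        return JxlKind::Container;
    }
    // "Start of jxl codestream" marker 0xFF0A (option 1 above).
    if (size >= 2 && data[0] == 0xff && data[1] == 0x0a) {
        return JxlKind::NakedCodestream;
    }
    return JxlKind::Unknown;
}
```

A JPEG file, which starts with ff d8, falls through to Unknown here, so the shared leading 0xff byte is not enough to confuse the two formats.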

clanmills commented 3 years ago

Thank you, @jonsneyers. That's very helpful. I've cloned jpeg-xl and I'm building it on macOS as follows.

$ git clone --recurse-submodules -j8 https://gitlab.com/wg1/jpeg-xl.git
$ mkdir jpeg-xl/build 
$ cd jpeg-xl/build
$ cmake ..
$ make

The option --recurse-submodules manages the dependency complex which includes highway, Little-CMS and more:

633 rmills@rmillsmm-local:~/gnu/github/jpeg-xl/.git $ cat config 
[core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
    ignorecase = true
    precomposeunicode = true
[submodule]
    active = .
[remote "origin"]
    url = https://gitlab.com/wg1/jpeg-xl.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
[submodule "third_party/IQA-optimization"]
    url = https://github.com/veluca93/IQA-optimization.git
[submodule "third_party/brotli"]
    url = https://github.com/google/brotli
[submodule "third_party/difftest_ng"]
    url = https://github.com/thorfdbg/difftest_ng.git
[submodule "third_party/googletest"]
    url = https://github.com/google/googletest
[submodule "third_party/highway"]
    url = https://github.com/google/highway
[submodule "third_party/lcms"]
    url = https://github.com/mm2/Little-CMS
[submodule "third_party/lodepng"]
    url = https://github.com/lvandeve/lodepng
[submodule "third_party/sjpeg"]
    url = https://github.com/webmproject/sjpeg.git
[submodule "third_party/skcms"]
    url = https://skia.googlesource.com/skcms
[submodule "third_party/vmaf"]
    url = https://github.com/Netflix/vmaf.git
634 rmills@rmillsmm-local:~/gnu/github/jpeg-xl/.git $ 

I don't totally understand your comment:

If you take the current cjxl and encode a PNG or JPEG which has Exif metadata, it will produce an "option 2" file by default and preserve the Exif metadata.

However, I suspect when the build finishes, we'll be "on the same page". Thank You very much for your help with this.

clanmills commented 3 years ago

I can't build this on macOS.

/Users/rmills/gnu/github/jpeg-xl/lib/extras/codec_jpg.cc:289:5: error: no matching function for call to 'jpeg_mem_src'
    jpeg_mem_src(&cinfo, reinterpret_cast<const unsigned char*>(bytes.data()),
    ^~~~~~~~~~~~
/Library/Frameworks/Mono.framework/Headers/jpeglib.h:959:14: note: candidate function not viable: 2nd argument ('const unsigned char *') would lose const qualifier
EXTERN(void) jpeg_mem_src JPP((j_decompress_ptr cinfo,
             ^
1 error generated.
make[2]: *** [lib/CMakeFiles/jxl_extras-static.dir/extras/codec_jpg.cc.o] Error 1
make[1]: *** [lib/CMakeFiles/jxl_extras-static.dir/all] Error 2
make: *** [all] Error 2

I tried upgrading libjpeg.8.dylib to libjpeg.9.dylib and hit the same jpeg_mem_src error. So, I mutilated the code with a cast. A few seconds further into the build the compiler encountered issues with tell(), seek() and other I/O functions. So, I thought to enable/disable some of the undocumented options.

(venv) 1031 rmills@rmillsmm-local:~/gnu/github/jpeg-xl/build $ grep -i enable ../CMakeLists.txt | grep -i true
  set(ENABLE_FUZZERS_DEFAULT true)
    set(ENABLE_TCMALLOC_DEFAULT true)
set(JPEGXL_ENABLE_BENCHMARK true CACHE BOOL
set(JPEGXL_ENABLE_EXAMPLES true CACHE BOOL
set(JPEGXL_ENABLE_SJPEG true CACHE BOOL
set(JPEGXL_ENABLE_OPENEXR true CACHE BOOL
set(JPEGXL_ENABLE_SKCMS true CACHE BOOL
(venv) 1032 rmills@rmillsmm-local:~/gnu/github/jpeg-xl/build $ 

I don't know what ENABLE_SJPEG is, however I tried it and still hit the jpeg_mem_src issue.

$ rm -rf CM*
$ cmake .. -DJPEGXL_ENABLE_SJPEG=False
$ make 
.....  rattle rattle rattle .....
error: no matching function for call to 'jpeg_mem_src'
    jpeg_mem_src(&cinfo, reinterpret_cast<const unsigned char*>(bytes.data()),

kmilos commented 3 years ago

@clanmills FWIW, I managed to build under mingw easily:

  1. Use system provided mingw-w64-x86_64-{asciidoc,brotli,clang,giflib,lcms2,libjpeg,libpng,libwebp,ninja,openexr}
  2. Download and unpack the latest jpeg-xl tarball (0.3.4 for me)
  3. Run deps.sh to get the rest of dependencies
  4. Follow the rest of the instructions with some small mods:
export CC=clang CXX=clang++
mkdir build
cd build
cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DBUILD_TESTING=OFF ..
cmake --build .

I was then able to convert

tools/cjxl ../../exiv2/test/data/Reagan.jpg Reagan.jxl

Reagan.zip

clanmills commented 3 years ago

Thanks, @kmilos for Reagan.jxl. For sure, that smells of bmff:

691 rmills@rmillsm1:~/Downloads $ dmpf Reagan.jxl count=100
       0        0: ___.JXL ....___.ftypjxl ____jxl   ->  00 00 00 0c 4a 58 4c 20 0d 0a 87 0a 00 00 00 14 66 74 79 70 6a 78 6c 20 00 00 00 00 6a 78 6c 20
    0x20       32: __.ZExif____MM_*___._..__.___._.  ->  00 00 16 5a 45 78 69 66 00 00 00 00 4d 4d 00 2a 00 00 00 08 00 13 01 00 00 03 00 00 00 01 00 c8
    0x40       64: __.._.___._.__.._.___.___..._.__  ->  00 00 01 01 00 03 00 00 00 01 00 82 00 00 01 02 00 03 00 00 00 04 00 00 00 f2 01 03 00 03 00 00
    0x60       96: _._.                              ->  00 01 00 01
692 rmills@rmillsm1:~/Downloads $ 

I expect we'll easily decode that. Support for this might go into Exiv2 v0.27.4 RC2.

The 'naked codestream' JXL (which starts with 0xff0a) will require more investigation. We might get lucky and src/jpgimage.cpp can be easily "tweaked" to deal with that.

@jonsneyers, do you have a 'naked codestream' test image with metadata? Is there a specification for this format, or is it covered by a pre-existing spec?

clanmills commented 3 years ago

I made a little change to the tvisitor.cpp code in my book, and:

1055 rmills@rmillsmm-local:~/gnu/exiv2/team/book/build $ ./tvisitor -pR  ../files/Reagan.jxl 
STRUCTURE OF JP2 (jxl ) FILE (MM): ../files/Reagan.jxl
 address |   length | box  | uuid | data
      12 |       20 | ftyp |      | jxl ____jxl  106 120 108 32 0 0 0 0 106 120 108 32
      32 |     5722 | Exif |      | ____MM_*___._..__.__ 0 0 0 0 77 77 0 42 0 0 0 8 0 19 1 0 0 3 0 0
    5754 |     5306 | xml  |      | <?xpacket begin="... 60 63 120 112 97 99 107 101 116 32 98 101 103 105 110 61 34 239 187 191
   11060 |     1707 | jbrd |      | .6.....-........H_.. 194 54 20 221 13 232 8 45 147 149 5 222 11 142 166 8 72 0 7 128
   12767 |    20125 | jxlc |      | .......N..L_@_@.01.$ 255 10 8 4 142 129 16 78 25 6 76 0 64 0 64 128 48 49 15 36
END: ../files/Reagan.jxl
1056 rmills@rmillsmm-local:~/gnu/exiv2/team/book/build $ 

As you can see, we've found the Exif and xml boxes. The option -pR will require work because metadata in AVIF, HEIC and CR3 files is stored in substructures of the meta box. Adding box handlers for xml and Exif is straightforward. I will add this to tvisitor.cpp and add a paragraph to the book.
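
The box listing above can be reproduced with a very small walker. This is only a sketch under stated assumptions: every box has an 8-byte header (32-bit big-endian length followed by a four-character type), and the special lengths 0 and 1 are not handled. Box, readU32BE and listBoxes are hypothetical names and are not taken from tvisitor.cpp:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One top-level box, as printed in the address/length/box columns above.
struct Box {
    std::size_t address;   // offset of the box header in the file
    std::uint64_t length;  // total box length, header included
    std::string type;      // four-character box name, e.g. "Exif"
};

// Read a big-endian 32-bit value.
static std::uint32_t readU32BE(const std::uint8_t* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Walk the top-level boxes of an in-memory file. The walk stops at the
// first header that is too short or that runs past the end of the buffer.
std::vector<Box> listBoxes(const std::vector<std::uint8_t>& data) {
    std::vector<Box> boxes;
    std::size_t pos = 0;
    while (data.size() - pos >= 8) {
        const std::uint64_t length = readU32BE(&data[pos]);
        const std::string type(reinterpret_cast<const char*>(&data[pos + 4]), 4);
        // Lengths 0 (box runs to end of file) and 1 (64-bit length follows)
        // are deliberately left out of this sketch.
        if (length < 8 || length > data.size() - pos) break;
        boxes.push_back({pos, length, type});
        pos += static_cast<std::size_t>(length);
    }
    return boxes;
}
```

Starting the walk at offset 0 also reports the 12-byte JXL signature box, which the listing above skips because the patched tvisitor seeks straight to offset 12.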

I am not committing to adding JXL support to Exiv2 at the moment. For sure, we need to know more about naked codestream JXL.

Index: ../tvisitor.cpp
===================================================================
--- ../tvisitor.cpp (revision 5306)
+++ ../tvisitor.cpp (working copy)
@@ -1334,6 +1334,7 @@
     const char*  kJp2Box_hdlr  = "hdlr";
     const char*  kJp2Box_iinf  = "iinf";
     const char*  kJp2Box_iloc  = "iloc";
+    const char*  kJp2Box_JXL   = "JXL ";

     const uint16_t kAppExt     = 0xff21;
     const uint16_t kComExt     = 0xfe21;
@@ -1951,6 +1952,12 @@
             io().read   (&box,4) ; // box

             valid_ = boxName(box) == kJp2Box_jP ;
+            if ( boxName(box) == kJp2Box_JXL ) {
+                start_ = 12;
+                io().seek(start_);
+                io().getLong(endian_); // length
+                io().read   (&box,4) ; // box
+            }
             if ( boxName(box) == kJp2Box_ftyp ) {
                 valid_  = true ;
                 io().read(&box,4);

clanmills commented 3 years ago

I've re-read the statement by @jonsneyers:

All render-impacting data is encoded in the codestream, including color space and orientation. So for web delivery, we anticipate option 1 will be most common. All non-render-impacting metadata is stored outside the codestream though, so if you want to have that, you need to use option 2

I think that means that metadata (Exif, xmp etc) is only stored in the BMFF variant of JXL. There is no metadata in naked codestream JXL which has been optimised for streaming.

So, it looks as though bmff/JXL support in Exiv2 only requires a few lines of code to be ported from my book.

kmilos commented 3 years ago

In the long run, I think parsing the "render-impacting" data in the naked codestream is generally of interest as well, e.g.

But I agree it is out of scope for now. And then there is also the JUMBF metadata format. So while we can/should probably go for the low hanging fruit of Exif and XMP in BMFF, we should probably leave the feature request open (or break it down).

1div0 commented 3 years ago

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC list of patent declarations received (see http://patents.iec.ch).

Déjà vu.

clanmills commented 3 years ago

@1div0 The statement "Attention is drawn..." was cut'n'pasted from the ISO standard into README.md. That's not a coincidence. The responsibility for using the libexiv2 bmff code is passed via the application to the user.

@kmilos. I don't think there's any difficulty in treating a naked codestream JXL as unknown image type while parsing bmff JXL. Exiv2 supports Exif/XMP/IPTC. ICC is not supported in all image formats (see man/man1/exiv2.1).

I've never heard of JUMBF metadata before this discussion. The "Unified Metadata Container" is required to provide the architecture for metadata standards in addition to Exif/XMP/IPTC. Andreas has a prototype implementation in the "unstable" branch on SVN. Porting that code into Exiv2 would be a very valuable project for a new maintainer.

I'd like to point out that bmff JXL is only low hanging fruit thanks to my book and src/bmffimage.cpp. A year ago, we could not reach that fruit and that was before the legal issue was added to the bmff challenge. JUMBF could be quite a stretch.

clanmills commented 3 years ago

I've also realised (after re-reading) that @jonsneyers said:

Additionally, we are planning to add an option to do Brotli-compressed versions of exif and xmp metadata, though maybe it's a bit early to add support for that since JPEG XL Part 2 (which defines these things) is not finalized yet.

I've never heard of brotli compression. Presumably there's a library required to deal with that. It looks like naked codestream JXL is something for the future.

To support bmff/JXL, I think we only have to add support for boxes Exif and xml to bmffimage.cpp and update the test suite with Milos' file Reagan.jxl. I'll add the code to my book and see what's involved. If it's as easy as I suspect, we could add support for bmff/JXL to Exiv2 v0.27.4 RC2 which is scheduled for 2021-04-30.

veluca93 commented 3 years ago

There is no Exif or XMP metadata in a "naked codestream" JXL file - it's just not supported by the format, so nothing needs to be done there.

Leaving support for brotli-compressed boxes to the future is perfectly fine - they are not even fully specified yet, while Exif and xml boxes are and should be exactly the same as the same boxes in other formats.

clanmills commented 3 years ago

Thank you @veluca93. You've confirmed my understanding of the situation.

You are quite correct that the Exif and xml boxes can appear in any bmff file. CR3, HEIF (and AVIF) use other features of bmff to store that metadata. The Exiv2 bmffimage.cpp code handles those more devious files and doesn't handle the simpler boxes!

It is easy to add the xml and Exif box handlers. I did it in a few minutes on the tvisitor.cpp code in my book. I'll update the book tomorrow.
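
For the Exif box, a handler mostly has to find where the embedded TIFF starts. In Reagan.jxl the payload opens with four zero bytes before the MM\0* header. The sketch below assumes those four bytes are a big-endian offset to the TIFF header, as in the HEIF Exif layout. That reading is an assumption and is not confirmed in this thread. TiffSpan and locateTiffInExifPayload are hypothetical names, not Exiv2 APIs:

```cpp
#include <cstddef>
#include <cstdint>

// Result of the lookup: where the TIFF starts inside the Exif box payload.
struct TiffSpan {
    bool found;
    std::size_t offset;  // relative to the start of the payload
    std::size_t length;  // bytes from the TIFF header to the end of the payload
};

// `payload` is the content of an Exif box, i.e. the bytes after the 8-byte
// box header. Assumption: the first 4 bytes are a big-endian offset to the
// TIFF header, counted from the end of that 4-byte field.
TiffSpan locateTiffInExifPayload(const std::uint8_t* payload, std::size_t size) {
    const TiffSpan none = {false, 0, 0};
    if (size < 8) return none;  // 4-byte prefix + 4-byte TIFF signature
    const std::size_t skip = (std::size_t(payload[0]) << 24) |
                             (std::size_t(payload[1]) << 16) |
                             (std::size_t(payload[2]) << 8) |
                             std::size_t(payload[3]);
    if (skip > size - 8) return none;  // signature would fall outside the payload
    const std::size_t start = 4 + skip;
    const std::uint8_t* t = payload + start;
    const bool little = t[0] == 'I' && t[1] == 'I' && t[2] == 0x2a && t[3] == 0x00;
    const bool big = t[0] == 'M' && t[1] == 'M' && t[2] == 0x00 && t[3] == 0x2a;
    if (!little && !big) return none;
    const TiffSpan span = {true, start, size - start};
    return span;
}
```

For Reagan.jxl this agrees with the figures tvisitor prints: the box is at address 32, so the TIFF starts at 32 + 8 + 4 = 44 and is 5722 - 12 = 5710 bytes long, i.e. "44->5710".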

I've been speaking with @alexvanderberkel about whether we should include support for JXL in Exiv2 v0.27.4 (scheduled for 2021-05-22). It's more likely for Exiv2 v0.28 which is scheduled for 2021-09-15. "Code complete" for v0.27.4 was 2021-02-29.

Here's output from tvisitor on Milos' test file Reagan.jxl

1101 rmills@rmillsmm-local:~/gnu/exiv2/team/book/build $ ./tvisitor  -pR ../files/Reagan.jxl 
STRUCTURE OF JP2 (jxl ) FILE (MM): ../files/Reagan.jxl
 address |   length | box  | uuid | data
      12 |       20 | ftyp |      | jxl ____jxl  106 120 108 32 0 0 0 0 106 120 108 32
      32 |     5722 | Exif |      | ____MM_*___._..__.__ 0 0 0 0 77 77 0 42 0 0 0 8 0 19 1 0 0 3 0 0
  STRUCTURE OF TIFF FILE (MM): ../files/Reagan.jxl:44->5710
   address |    tag                                  |      type |    count |    offset | value
        10 | 0x0100 Exif.Image.ImageWidth            |     SHORT |        1 |           | 200
        22 | 0x0101 Exif.Image.ImageLength           |     SHORT |        1 |           | 130
        34 | 0x0102 Exif.Image.BitsPerSample         |     SHORT |        4 |       242 | 8 8 8 8
        46 | 0x0103 Exif.Image.Compression           |     SHORT |        1 |           | 1
        58 | 0x0106 Exif.Image.PhotometricInterpre.. |     SHORT |        1 |           | 2
        70 | 0x010e Exif.Image.ImageDescription      |     ASCII |      403 |       250 | 040621-N-6536T-062
USS Ronald Reagan +++
        82 | 0x010f Exif.Image.Make                  |     ASCII |       18 |       653 | NIKON CORPORATION
        94 | 0x0110 Exif.Image.Model                 |     ASCII |       10 |       671 | NIKON D1X
       106 | 0x0112 Exif.Image.Orientation           |     SHORT |        1 |           | 1
       118 | 0x0115 Exif.Image.SamplesPerPixel       |     SHORT |        1 |           | 4
       130 | 0x011a Exif.Image.XResolution           |  RATIONAL |        1 |       681 | 3000000/10000
       142 | 0x011b Exif.Image.YResolution           |  RATIONAL |        1 |       689 | 3000000/10000
       154 | 0x011c Exif.Image.PlanarConfiguration   |     SHORT |        1 |           | 1
       166 | 0x0128 Exif.Image.ResolutionUnit        |     SHORT |        1 |           | 2
       178 | 0x0131 Exif.Image.Software              |     ASCII |       40 |       697 | Adobe Photoshop Elements 12.0 Macintosh
       190 | 0x0132 Exif.Image.DateTime              |     ASCII |       20 |       737 | 2016:09:13 11:58:16
       202 | 0x013b Exif.Image.Artist                |     ASCII |       34 |       757 | Photographerís Mate 3rd Class (A
       214 | 0x8769 Exif.Image.ExifTag               |      LONG |        1 |           | 792
    STRUCTURE OF TIFF FILE (MM): ../files/Reagan.jxl:44->5710
     address |    tag                                  |      type |    count |    offset | value
         794 | 0x829a Exif.Photo.ExposureTime          |  RATIONAL |        1 |      1254 | 1/125
         806 | 0x829d Exif.Photo.FNumber               |  RATIONAL |        1 |      1262 | 5/1
         818 | 0x8822 Exif.Photo.ExposureProgram       |     SHORT |        1 |           | 1
         830 | 0x9000 Exif.Photo.ExifVersion           | UNDEFINED |        4 |           | 0220
         842 | 0x9003 Exif.Photo.DateTimeOriginal      |     ASCII |       20 |      1270 | 2004:06:21 23:37:53
         854 | 0x9004 Exif.Photo.DateTimeDigitized     |     ASCII |       20 |      1290 | 2004:06:21 23:37:53
         866 | 0x9101 Exif.Photo.ComponentsConfigura.. | UNDEFINED |        4 |           | ..._
         878 | 0x9102 Exif.Photo.CompressedBitsPerPi.. |  RATIONAL |        1 |      1310 | 4/1
         914 | 0x9204 Exif.Photo.ExposureBiasValue     | SRATIONAL |        1 |      1334 | 1/3
         926 | 0x9205 Exif.Photo.MaxApertureValue      |  RATIONAL |        1 |      1342 | 3/1
         938 | 0x9207 Exif.Photo.MeteringMode          |     SHORT |        1 |           | 2
         950 | 0x9208 Exif.Photo.LightSource           |     SHORT |        1 |           | 10
         962 | 0x9209 Exif.Photo.Flash                 |     SHORT |        1 |           | 0
         974 | 0x920a Exif.Photo.FocalLength           |  RATIONAL |        1 |      1350 | 42/1
        1022 | 0xa000 Exif.Photo.FlashpixVersion       | UNDEFINED |        4 |           | 0100
        1034 | 0xa001 Exif.Photo.ColorSpace            |     SHORT |        1 |           | 65535
        1046 | 0xa002 Exif.Photo.PixelXDimension       |      LONG |        1 |           | 200
        1058 | 0xa003 Exif.Photo.PixelYDimension       |      LONG |        1 |           | 130
        1082 | 0xa300 Exif.Photo.FileSource            | UNDEFINED |        1 |           | .
        1094 | 0xa301 Exif.Photo.SceneType             | UNDEFINED |        1 |           | .
        1106 | 0xa401 Exif.Photo.CustomRendered        |     SHORT |        1 |           | 0
        1118 | 0xa402 Exif.Photo.ExposureMode          |     SHORT |        1 |           | 1
        1130 | 0xa403 Exif.Photo.WhiteBalance          |     SHORT |        1 |           | 1
        1166 | 0xa406 Exif.Photo.SceneCaptureType      |     SHORT |        1 |           | 0
        1190 | 0xa408 Exif.Photo.Contrast              |     SHORT |        1 |           | 0
        1202 | 0xa409 Exif.Photo.Saturation            |     SHORT |        1 |           | 0
        1214 | 0xa40a Exif.Photo.Sharpness             |     SHORT |        1 |           | 0
    END: ../files/Reagan.jxl:44->5710
       226 | 0x8825 Exif.Image.GPSTag                |      LONG |        1 |           | 1400
    STRUCTURE OF TIFF FILE (MM): ../files/Reagan.jxl:44->5710
     address |    tag                                  |      type |    count |    offset | value
        1402 | 000000 Exif.GPSInfo.GPSVersionID        |     UBYTE |        4 |           | 2 2 0 0
    END: ../files/Reagan.jxl:44->5710
  END: ../files/Reagan.jxl:44->5710
    5754 |     5306 | xml  |      | <?xpacket begin="... 60 63 120 112 97 99 107 101 116 32 98 101 103 105 110 61 34 239 187 191
   11060 |     1707 | jbrd |      | .6.....-........H_.. 194 54 20 221 13 232 8 45 147 149 5 222 11 142 166 8 72 0 7 128
   12767 |    20125 | jxlc |      | .......N..L_@_@.01.$ 255 10 8 4 142 129 16 78 25 6 76 0 64 0 64 128 48 49 15 36
END: ../files/Reagan.jxl
1102 rmills@rmillsmm-local:~/gnu/exiv2/team/book/build $ ./tvisitor  -pX ../files/Reagan.jxl | xmllint --pretty 1 -
<?xml version="1.0"?>
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.3-c011 66.146729, 2012/05/03-13:40:03        ">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/" xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/" xmlns:stRef="http://ns.adobe.com/xap/1.0/sType/ResourceRef#" xmlns:stEvt="http://ns.adobe.com/xap/1.0/sType/ResourceEvent#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/" rdf:about="" xmp:ModifyDate="2016-09-13T11:58:16+01:00" xmp:CreateDate="2004-06-21T23:37:53+01:00" xmp:MetadataDate="2016-09-13T11:58:16+01:00" xmp:CreatorTool="Adobe Photoshop Elements 6.0 Macintosh" photoshop:Instructions="Credit as U.S. Navy photo by Elizabeth Thompson. " photoshop:CaptionWriter="Dir. NVNS" photoshop:Urgency="5" photoshop:City="Straits of Magellan" photoshop:Category="N" photoshop:Country="South America" photoshop:Credit="U.S Navy" photoshop:AuthorsPosition="U.S Navy Photographer" photoshop:DateCreated="2004-06-21" photoshop:Source="Navy Visual News Service" photoshop:LegacyIPTCDigest="977177A6C759A2BBD07317E3D5921073" photoshop:ColorMode="3" photoshop:ICCProfile="Adobe RGB (1998)" xmpMM:InstanceID="xmp.iid:F77F117407206811822A8C00775B3FDC" xmpMM:DocumentID="uuid:D6CBDC1D8DF2E511BA6BFBE914561F6D" xmpMM:OriginalDocumentID="uuid:D6CBDC1D8DF2E511BA6BFBE914561F6D" dc:format="image/jpeg" xmpRights:Marked="False">
      <photoshop:SupplementalCategories>
        <rdf:Bag>
          <rdf:li>703-614-9154</rdf:li>
          <rdf:li>navyvisualnews@navy.mil</rdf:li>
          <rdf:li>UNCLASSFIED</rdf:li>
        </rdf:Bag>
      </photoshop:SupplementalCategories>
      <xmpMM:DerivedFrom stRef:instanceID="uuid:ec11a6b0-cc13-11d8-9c21-fa22e28297f6" stRef:documentID="adobe:docid:photoshop:1c90e091-c489-11d8-ad7d-b4c1b2598b09"/>
      <xmpMM:History>
        <rdf:Seq>
          <rdf:li stEvt:action="saved" stEvt:instanceID="xmp.iid:F77F117407206811822A8C00775B3FDC" stEvt:when="2016-09-13T11:58:16+01:00" stEvt:softwareAgent="Adobe Photoshop Elements 12.0 Macintosh" stEvt:changed="/"/>
        </rdf:Seq>
      </xmpMM:History>
      <dc:description>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">040621-N-6536T-062
USS Ronald Reagan (CVN 76), June 21, 2004 -  USS Ronald Reagan (CVN 76) sails through the Straits of Magellan on its way to the Pacific Ocean. The Navy&#xED;s newest aircraft carrier is underway circumnavigating South America in transit to its new homeport of San Diego. U.S. Navy photo by Photographer&#xED;s Mate 3rd Class (AW) Elizabeth Thompson. (RELEASE)
                               </rdf:li>
        </rdf:Alt>
      </dc:description>
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">040621-N-6536T-062</rdf:li>
        </rdf:Alt>
      </dc:title>
      <dc:creator>
        <rdf:Seq>
          <rdf:li>Photographer&#xED;s Mate 3rd Class (A</rdf:li>
        </rdf:Seq>
      </dc:creator>
      <dc:subject>
        <rdf:Bag>
          <rdf:li>ronald reagan</rdf:li>
          <rdf:li>reagan</rdf:li>
          <rdf:li>cvn 76</rdf:li>
          <rdf:li>cvn-76</rdf:li>
          <rdf:li>straights magellan</rdf:li>
          <rdf:li>magellan</rdf:li>
          <rdf:li>carrier</rdf:li>
          <rdf:li>nimitz-class</rdf:li>
          <rdf:li>ship</rdf:li>
          <rdf:li>underway</rdf:li>
        </rdf:Bag>
      </dc:subject>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>
1103 rmills@rmillsmm-local:~/gnu/exiv2/team/book/build $ 

Here are the drawings from my book which explain how the metadata is stored in CR3 and HEIC/AVIF files: cr3 heic

kmilos commented 3 years ago

There is no Exif or XMP metadata in a "naked codestream" JXL file - it's just not supported by the format, so nothing needs to be done there.

I beg to differ, there is very useful information about the image there, see above.

clanmills commented 3 years ago

@kmilos. I think @veluca93 and @jonsneyers are saying there is no Exif, XML or IPTC metadata in a naked codestream JXL. And you're saying there is useful information there about the image (size, orientation and so on).

I think we're all in agreement, aren't we?

kmilos commented 3 years ago

That's accurate, yes.

veluca93 commented 3 years ago

Yes, there is no Exif, XML or IPTC. There can be an ICC profile (or equivalent), image dimensions and intrinsic size, bit depth of the original image, and an orientation flag as far as I remember on the top of my head - as well as a preview, which might be considered a form of metadata I guess - but not Exif, XML or IPTC.

The header format for naked-codestream-JXL is (to the best of my knowledge) quite different from anything else, so I wouldn't consider supporting that a priority. Reading the ICC profile in particular requires a quite large chunk of the JXL decoder itself.

kmilos commented 3 years ago

Thanks Luca, that was very helpful information.

clanmills commented 3 years ago

I've added a new section to my book concerning JXL: https://clanmills.com/exiv2/book/#JXL

Thank you to @veluca93 and @jonsneyers for very useful information. Thank you to @kmilos and @1div0 for your participation in this discussion.


JPEG-XL Format

The JXL format is the current contender to replace JPEG/GIF as the most popular image format. At the time of writing (2021), it is too early to say if it will reach the goal that eluded PNG, JP2 and WebP. There is a discussion of this format here: https://github.com/Exiv2/exiv2/issues/1503.

JPEG-XL is the only format discussed in this book which has two file layouts. The first format is naked codestream JXL. The first two bytes are 0xff0a. I have no further information about this stream. It does not contain Exif, IPTC or XML data. In correspondence with the authors of the JPEG-XL standard, they explained: "Additionally, we are planning to add an option to do Brotli-compressed versions of exif and xmp metadata, though maybe it's a bit early to add support for that since JPEG XL Part 2 (which defines these things) is not finalized yet."

...book $ dmpf count=20 files/jxl.jxl 
       0        0: .......'.#J....EQ._.              ->  ff 0a fa bb e8 f3 e1 27 85 23 4a 01 03 0a 10 45 51 14 00 08
...book $ 

The other JPEG-XL format is JXL/BMFF and is bmff based:

book $ dmpf count=20 files/Reagan.jxl 
       0        0: ___.JXL ....___.ftyp              ->  00 00 00 0c 4a 58 4c 20 0d 0a 87 0a 00 00 00 14 66 74 79 70
book $ 

As you can see, there is an opening 12 byte box of type JXL which precedes the ftyp box. The meaning of the 4 payload bytes is unknown.

The structure of files/Reagan.jxl is revealed by tvisitor as follows:

...book $ tvisitor -pS files/Reagan.jxl
STRUCTURE OF JXL FILE (MM): files/Reagan.jxl
 address |   length | box  | uuid | data
       0 |       12 | JXL  |      | .... 13 10 135 10
      12 |       20 | ftyp |      | jxl ____jxl  106 120 108 32 0 0 0 0 106 120 108 32
      32 |     5722 | Exif |      | ____MM_*___._..__.__ 0 0 0 0 77 77 0 42 0 0 0 8 0 19 1 0 0 3 0 0
    5754 |     5306 | xml  |      | <?xpacket begin="... 60 63 120 112 97 99 107 101 116 32 98 101 103 105 110 61 34 239 187 191
   11060 |     1707 | jbrd |      | .6.....-........H_.. 194 54 20 221 13 232 8 45 147 149 5 222 11 142 166 8 72 0 7 128
   12767 |    20125 | jxlc |      | .......N..L_@_@.01.$ 255 10 8 4 142 129 16 78 25 6 76 0 64 0 64 128 48 49 15 36
END: files/Reagan.jxl
...book $
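
The listing above can be reproduced in outline by walking the top-level boxes: each box starts with a 4-byte big-endian length followed by a 4-byte type, and the length includes that 8-byte header. The sketch below is an illustration under those assumptions, not the tvisitor implementation. The name iter_boxes is hypothetical, and the special length values 0 and 1 (which bmff reserves for "extends to end of file" and for 64-bit lengths) are deliberately not handled.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[int, int, str]]:
    """Yield (address, length, type) for each top-level box in data."""
    offset = 0
    while offset + 8 <= len(data):
        length, box_type = struct.unpack_from(">I4s", data, offset)
        if length < 8:
            # Lengths 0 and 1 have special meanings that this sketch skips.
            break
        yield offset, length, box_type.decode("latin-1")
        offset += length


if __name__ == "__main__":
    # First two boxes of files/Reagan.jxl, rebuilt from the listing above.
    sample = bytes.fromhex(
        "0000000c4a584c200d0a870a"                  # JXL box, 12 bytes
        "00000014667479706a786c20000000006a786c20"  # ftyp box, 20 bytes
    )
    for address, length, box_type in iter_boxes(sample):
        print(f"{address:8} | {length:8} | {box_type}")
```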

The file Reagan.jxl uses 6 box types of which only xml and ftyp are specified in w15177. The boxes are:

Name | Specification | Purpose
JXL  | None          | File identifier
ftyp | 4.3.2         | File type
Exif | None          | Embedded Tiff File for Exif metadata
xml  | 8.11.1.2      | XML which is XMP
jbrd | None          | JXL Brotli Compressed Data?
jxlc | None          | JXL Code Stream? See below.

The JXL Code Stream starts with 0xff0a and is presumably identical to the naked codestream JXL.

...book $ dmpf count=16 skip=12767 files/Reagan.jxl 
  0x31df    12767: __N.jxlc.......N..L_              ->  00 00 4e 9d 6a 78 6c 63 ff 0a 08 04 8e 81 10 4e
                                                         <-- Len --> <j  x  l c> <--- code stream ------
...book $ 
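
As a cross-check, the 16 bytes in the dump above can be decoded by hand. This small sketch uses only the bytes shown there; the variable names are illustrative.

```python
import struct

# The 16 bytes at offset 12767 of files/Reagan.jxl, copied from the dump.
dump = bytes.fromhex("00 00 4e 9d 6a 78 6c 63 ff 0a 08 04 8e 81 10 4e")

# Box header: 4-byte big-endian length, then the 4-byte box type.
length, box_type = struct.unpack_from(">I4s", dump, 0)
payload = dump[8:]

print(length)                # 0x4e9d as a decimal number
print(box_type)              # the box type
print(payload[:2].hex(" "))  # first two bytes of the code stream
print(12767 + length)        # offset at which the jxlc box ends
```

The length decodes to 20125, which agrees with the tvisitor listing, and the payload begins with ff 0a, the same two bytes that open the naked codestream file.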

The image files/Reagan.jxl was created with the utility cjxl as follows:

...tools $ cjxl exiv2/test/data/Reagan.jpg Reagan.jxl

Thanks to Miloš, the instructions for building cjxl on MinGW are here: https://github.com/Exiv2/exiv2/issues/1503#issuecomment-803943178.

veluca93 commented 3 years ago

To answer some of the implicit questions in what you wrote:

The boxes should (eventually) be specified in ISO/IEC 18181-2. I believe that the Exif box is also specified in a JPEG XS standard, but don't quote me on that :P

clanmills commented 3 years ago

@veluca93 Thank You very much for this very useful information. I'll update the book. That's a good trick to put those binary bytes (such as 135) into the payload of the JXL box. Everything you've done makes sense to me.

Best Wishes in your mission. The whole world wants to move on from JPEG/GIF.

kmilos commented 3 years ago

@clanmills I'd also move the Brotli (just more advanced zip/deflate) note to the second (BMFF based) section, as it only pertains to Exif and XMP compression.

So as far as I understand, you get the Exif and xml boxes either as plain-text, or (potentially) as Brotli compressed, but only within a BMFF file in any case...
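
For the uncompressed case, the first bytes of the Exif box payload shown in the tvisitor listing (0 0 0 0 77 77 0 42 0 0 0 8 0 19 ...) can be read as follows. This is a sketch under one assumption that the thread does not confirm: the leading 4 bytes are treated as a big-endian offset to the TIFF header, as in the HEIF convention. The function name parse_exif_payload is hypothetical.

```python
import struct


def parse_exif_payload(payload: bytes) -> dict:
    """Read the TIFF header inside an Exif box payload (illustrative only)."""
    # ASSUMPTION: the first 4 bytes are an offset to the TIFF header.
    (tiff_offset,) = struct.unpack_from(">I", payload, 0)
    tiff = payload[4 + tiff_offset:]

    byte_order = tiff[:2]
    if byte_order == b"MM":
        endian = ">"  # big-endian (Motorola)
    elif byte_order == b"II":
        endian = "<"  # little-endian (Intel)
    else:
        raise ValueError("not a TIFF header")

    magic, ifd_offset = struct.unpack_from(endian + "HI", tiff, 2)
    (entry_count,) = struct.unpack_from(endian + "H", tiff, ifd_offset)
    return {
        "byte_order": byte_order.decode("ascii"),
        "magic": magic,
        "ifd_offset": ifd_offset,
        "entry_count": entry_count,
    }


if __name__ == "__main__":
    # The 20 payload bytes printed by tvisitor for the Exif box.
    payload = bytes([0, 0, 0, 0, 77, 77, 0, 42, 0, 0, 0, 8,
                     0, 19, 1, 0, 0, 3, 0, 0])
    print(parse_exif_payload(payload))
```

A Brotli-compressed variant, if it is eventually specified, would need to be decompressed before this step; nothing here attempts that.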

kmilos commented 3 years ago

Everything you've done makes sense to me.

The header format for naked-codestream-JXL is (to the best of my knowledge) quite different from anything else, so I wouldn't consider supporting that a priority. Reading the ICC profile in particular requires a quite large chunk of the JXL decoder itself.

Well, this part is not quite ideal for metadata parsers (especially given the patent warnings)... Though I see why it would make sense for image renderers.

kmilos commented 3 years ago

Found some nice public info about the header metadata and ICC decoding here: https://arxiv.org/abs/1908.03565 (Annex A and Annex B are of relevance). It might be slightly out of date, but it's something to chew on for starters; of course, one should ultimately verify against the reference jpeg-xl software.

@clanmills How about clearly splitting and documenting this FR in 2 parts: Exif and xml box extraction from the BMFF container JXL variety first, then naked JXL codestream header and ICC decoding in the future?

clanmills commented 3 years ago

Thanks, @kmilos. This is very helpful.

For sure, it's a two-step process to fully support JXL. I think this issue should be renamed "Support JXL/bmff". It's likely to ship in v0.28 on 2021-09-15. You'll see I have set the milestone to v0.27.4, however that's just to keep it visible on my TODO list. I will probably defer it to v0.28.

Can you open another issue (with no milestone) "Support JXL/codestream"?

We've had no feedback from @arun54321, so we don't know his expectations.

veluca93 commented 3 years ago

IIRC the only part that is even remotely up to date in that document is the ICC predictor. The source code is probably the best place to get information from, i.e. https://gitlab.com/wg1/jpeg-xl/-/blob/master/lib/jxl/headers.h

arun54321 commented 3 years ago

Should I rename the issue to 'Support JXL/bmff' ?

clanmills commented 3 years ago

Thank you, @arun54321, for your feedback. I have opened #1506 for future work on JXL/codestream and renamed this issue to deal with JXL/bmff. I hope the direction we have taken on this issue meets your hopes and expectations.

clanmills commented 3 years ago

Fix for JXL/bmff added. PR: #1519. This fix will be in Exiv2 v0.27.4 RC2 expected 2021-04-08. Fix will be in Exiv2 v0.27.4 GM scheduled for 2021-04-30.