Exiv2 / exiv2

Image metadata library and tools
http://www.exiv2.org/

no preview for cr3 image #1893

Closed paolobenve closed 2 years ago

paolobenve commented 3 years ago

exiv2 0.27.5.1:

$ exiv2 -ep1 /path/to/my/image.cr3
/path/to/my/image.cr3: Image does not have preview 1

Is this expected, or am I missing something?

postscript-dev commented 3 years ago

I am working with the exiv2 program and manpage at the moment, so had a quick look at this.

A CR3 test file that contains preview images can be found here. It is worth noting that the thumbnail is included in the list of any previews.

You can list all the available previews in the file using:

$ exiv2 --print p IMG_0482_raw.CR3

but none are listed. I tried this with Exiv2 0.27.4 and 1.0.0.9. If I use ExifTool, I can extract all the previews (including a JPEG conversion):

exiftool -a -b -W %d%f_%t%-c.%s -preview:all IMG_0482_raw.CR3

The manpage states that Exiv2 supports reading thumbnails from CR3 files, but this doesn't work. I will update the manpage, but do not have time to investigate further.

clanmills commented 3 years ago

@paolobenve Good to hear from you, Paolo. I hope all's well with you. I don't believe you received any assistance with localisation and that's a pity.

@postscript-dev Thank you for your helpful look at this, Peter.

I don't know the thumbnails/previews code as well as I know other parts of Exiv2. There are two ways in which previews/thumbnails are stored in images. The first is in the Exif data (normally in IFD1) and the exiv2 preview manager works on that data. The second depends on the file format and is not part of the Exif standard. I don't think Exiv2 provides any support for thumbnails/previews outside of the Exif metadata. So this is not an oversight in Exiv2 BMFF/CR3 support, it's a general weakness in Exiv2 which requires more development to support format-dependent thumbnails/previews.

In my book, I have an in-depth analysis of about 20 common image formats (Tiff, JPEG, PNG and others). The CR3 format has previews stored as illustrated in the uuid/canp and THMB boxes. The tvisitor code in my book understands these previews, although tvisitor does not provide a preview/thumbnail extractor. Here is the drawing of the CR3 format from my book.

[Image: CR3 format diagram]

There is a discussion in the book of previews and thumbnails. This is an area of Exiv2 that would benefit from more development. As I have now retired, I'm not volunteering to get involved. https://clanmills.com/exiv2/book/#5

clanmills commented 3 years ago

@paolobenve Paolo, when I was running today, I thought of something ultra-nerdy (anything to keep my mind off the pain).

I can prove my case that Exiv2 Preview Manager only understands Thumbnails in IFD1. In the test file Stonehenge.jpg in my book and http://clanmills.com/Stonehenge.jpg, we have:

$ tvisitor -pRU files/Stonehenge.jpg
STRUCTURE OF JPEG FILE (II): files/Stonehenge.jpg
 address | marker       |  length | signature
       0 | 0xffd8 SOI  
       2 | 0xffe1 APP1  |   15272 | Exif__II*_.___._..._.___.___..._.___.___
  STRUCTURE OF TIFF FILE (II): files/Stonehenge.jpg:12->15264
   address |    tag                                  |      type |    count |    offset | value
        10 | 0x010f Exif.Image.Make                  |     ASCII |       18 |       146 | NIKON CORPORATION
        22 | 0x0110 Exif.Image.Model                 |     ASCII |       12 |       164 | NIKON D5300
        34 | 0x0112 Exif.Image.Orientation           |     SHORT |        1 |           | 1
        46 | 0x011a Exif.Image.XResolution           |  RATIONAL |        1 |       176 | 300/1
        58 | 0x011b Exif.Image.YResolution           |  RATIONAL |        1 |       184 | 300/1
        70 | 0x0128 Exif.Image.ResolutionUnit        |     SHORT |        1 |           | 2
        82 | 0x0131 Exif.Image.Software              |     ASCII |       10 |       192 | Ver.1.00 
        94 | 0x0132 Exif.Image.DateTime              |     ASCII |       20 |       202 | 2015:07:16 20:25:28
       106 | 0x0213 Exif.Image.YCbCrPositioning      |     SHORT |        1 |           | 1
       118 | 0x8769 Exif.Image.ExifTag               |      LONG |        1 |           | 222
...
       130 | 0x8825 Exif.Image.GPSTag                |      LONG |        1 |           | 4072
...
IFD1  4322 | 0x0103 Exif.Image.Compression           |     SHORT |        1 |           | 6
      4334 | 0x011a Exif.Image.XResolution           |  RATIONAL |        1 |      4410 | 300/1
      4346 | 0x011b Exif.Image.YResolution           |  RATIONAL |        1 |      4418 | 300/1
      4358 | 0x0128 Exif.Image.ResolutionUnit        |     SHORT |        1 |           | 2
      4370 | 0x0201 Exif.Image.JPEGInterchangeFormat |      LONG |        1 |           | 4426
      4382 | 0x0202 Exif.Image.JPEGInterchangeLength |      LONG |        1 |           | 10837
      4394 | 0x0213 Exif.Image.YCbCrPositioning      |     SHORT |        1 |           | 1
  END: files/Stonehenge.jpg:12->15264
   15276 | 0xffe1 APP1  |    2786 | http://ns.adobe.com/xap/1.0/_<?xpacket b
   18064 | 0xffed APP13 |      96 | Photoshop 3.0_8BIM.._____-..__._...Z_..%
...
   18162 | 0xffe2 APP2  |    4094 | MPF_II*_.___.__.._.___0100..._.___.___..
...
   22831 | 0xffda SOS  
 6196491 | 0xffd9 EOI  
 6196976 | 0xffd8 SOI  
 6196978 | 0xffe1 APP1  |    1022 | Exif__II*_.___._i.._.___._________._..._
  STRUCTURE OF TIFF FILE (II): files/Stonehenge.jpg:6196988->1014
   address |    tag                                  |      type |    count |    offset | value
        10 | 0x8769 Exif.Image.ExifTag               |      LONG |        1 |           | 28
    STRUCTURE OF TIFF FILE (II): files/Stonehenge.jpg:6196988->1014
     address |    tag                                  |      type |    count |    offset | value
          30 | 0xa002 Exif.Photo.PixelXDimension       |     SHORT |        1 |           | 640
          42 | 0xa003 Exif.Photo.PixelYDimension       |     SHORT |        1 |           | 424
    END: files/Stonehenge.jpg:6196988->1014
  END: files/Stonehenge.jpg:6196988->1014
...
 6234864 | 0xffd8 SOI  
 6234866 | 0xffe1 APP1  |    1022 | Exif__II*_.___._i.._.___._________._..._
  STRUCTURE OF TIFF FILE (II): files/Stonehenge.jpg:6234876->1014
   address |    tag                                  |      type |    count |    offset | value
        10 | 0x8769 Exif.Image.ExifTag               |      LONG |        1 |           | 28
    STRUCTURE OF TIFF FILE (II): files/Stonehenge.jpg:6234876->1014
     address |    tag                                  |      type |    count |    offset | value
          30 | 0xa002 Exif.Photo.PixelXDimension       |     SHORT |        1 |           | 1620
          42 | 0xa003 Exif.Photo.PixelYDimension       |     SHORT |        1 |           | 1080
    END: files/Stonehenge.jpg:6234876->1014
  END: files/Stonehenge.jpg:6234876->1014
...
 6757985 | 0xffd9 EOI  
END: files/Stonehenge.jpg

There are three previews/thumbnails. One is in IFD1: Exif.Image.JPEGInterchangeFormat gives its offset (4426) and Exif.Image.JPEGInterchangeLength its size (10837 bytes). The other two, 640x424 and 1620x1080, are in the trailer of the JPG. Exiv2 only sees the one in IFD1:

771 rmills@rmillsm1:~/gnu/exiv2/team/book $ exiv2 -pp files/cr3.cr3 
772 rmills@rmillsm1:~/gnu/exiv2/team/book $ exiv2 -pp files/Stonehenge.jpg 
Preview 1: image/jpeg, 160x120 pixels, 10837 bytes
773 rmills@rmillsm1:~/gnu/exiv2/team/book $
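
As a worked example of where that IFD1 thumbnail sits in the file: offsets stored in Exif tags are relative to the TIFF header, which the dump places at byte 12 (`12->15264`). The sketch below does that arithmetic; `ByteRange` and `thumbnailRange` are made-up names for illustration, not Exiv2 API.

```cpp
#include <cstdint>

// Hypothetical helper type, not part of Exiv2: a half-open byte range in a file.
struct ByteRange {
    std::uint64_t begin;  // absolute file offset of the first byte
    std::uint64_t end;    // one past the last byte
};

// Offsets stored in TIFF/Exif tags are relative to the start of the TIFF header,
// so the absolute position is tiffStart + the tag value.
constexpr ByteRange thumbnailRange(std::uint64_t tiffStart, std::uint32_t jpegOffset,
                                   std::uint32_t jpegLength) {
    return ByteRange{tiffStart + jpegOffset, tiffStart + jpegOffset + jpegLength};
}

// Values from the Stonehenge.jpg dump: TIFF header at 12, offset 4426, length 10837.
static_assert(thumbnailRange(12, 4426, 10837).begin == 4438, "thumbnail starts at byte 4438");
static_assert(thumbnailRange(12, 4426, 10837).end == 15275, "thumbnail ends at byte 15275");
```

The resulting range, 4438 up to (but not including) 15275, fits inside the first APP1 segment, which ends where the next segment starts at address 15276.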

In the file cr3.cr3 in my book, we have:

$ tvisitor -pRU files/cr3.cr3
STRUCTURE OF JP2 (crx ) FILE (MM): files/cr3.cr3
 address |   length | box  | uuid | data
       0 |       24 | ftyp |      | crx ___.crx isom 99 114 120 32 0 0 0 1 99 114 120 32 105 115 111 109
      24 |    22792 | moov |      | __PXuuid............ 0 0 80 88 117 117 105 100 133 192 182 135 130 15 17 224 129 17 244 206
  STRUCTURE OF JP2 FILE (MM): files/cr3.cr3:32->22784
         0 |    20568 | uuid | cano | ___&CNCVCanonCR3_001 0 0 0 38 67 78 67 86 67 97 110 111 110 67 82 51 95 48 48 49
    STRUCTURE OF JP2 FILE (MM): files/cr3.cr3:32->22784:24->20552
           0 |       38 | CNCV |      | CanonCR3_001/00.09.0 67 97 110 111 110 67 82 51 95 48 48 49 47 48 48 46 48 57 46 48
          38 |       92 | CCTP |      | _______.___.___.CCDT 0 0 0 0 0 0 0 1 0 0 0 3 0 0 0 24 67 67 68 84
         130 |       92 | CTBO |      | ___.___.______Y ____ 0 0 0 4 0 0 0 1 0 0 0 0 0 0 89 32 0 0 0 0
         222 |       10 | free |      | __ 0 0
         232 |      392 | CMT1 |      | II*_.___.__.._.___p. 73 73 42 0 8 0 0 0 13 0 0 1 3 0 1 0 0 0 112 23
...
         624 |     1064 | CMT2 |      | II*_.___-_..._.___.. 73 73 42 0 8 0 0 0 39 0 154 130 5 0 1 0 0 0 226 1
      STRUCTURE OF TIFF FILE (II): files/cr3.cr3:32->22784:24->20552:632->1064
       address |    tag                                  |      type |    count |    offset | value
            10 | 0x829a Exif.Photo.ExposureTime          |  RATIONAL |        1 |       482 | 1/640
...
           310 | 0xa002 Exif.Photo.PixelXDimension       |     SHORT |        1 |           | 6000
           322 | 0xa003 Exif.Photo.PixelYDimension       |     SHORT |        1 |           | 4000
...
           466 | 0xa435 Exif.Photo.0xa435                |     ASCII |       11 |      1020 | 0000000000
      END: files/cr3.cr3:32->22784:24->20552:632->1064
        1688 |     5176 | CMT3 |      | II*_.___/_._._1___B. 73 73 42 0 8 0 0 0 47 0 1 0 3 0 49 0 0 0 66 2
...
        6864 |     1816 | CMT4 |      | II*_.___.___._.___.. 73 73 42 0 8 0 0 0 1 0 0 0 1 0 4 0 0 0 2 3
...
        8680 |    11864 | THMB |      | _____._x__.=_.__.... 0 0 0 0 0 160 0 120 0 0 46 61 0 1 0 0 255 216 255 219
    END: files/cr3.cr3:32->22784:24->20552
     20568 |      108 | mvhd |      | ____..4...4.___.___. 0 0 0 0 214 228 52 142 214 228 52 142 0 0 0 1 0 0 0 1
     20676 |      484 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
     21160 |      584 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
     21744 |      600 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
     22344 |      440 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
  END: files/cr3.cr3:32->22784
   22816 |    65560 | uuid |  xmp | <?xpacket begin=-... 60 63 120 112 97 99 107 101 116 32 98 101 103 105 110 61 39 239 187 191
   88376 |   264929 | uuid | canp | _______._...PRVW____ 0 0 0 0 0 0 0 1 0 4 10 193 80 82 86 87 0 0 0 0
  353305 | 13789912 | mdat |      | _____.j....._._..... 0 0 0 0 0 210 106 216 255 216 255 219 0 132 0 6 4 4 6 4
END: files/cr3.cr3

There are two previews/thumbnails. The first is a little JPEG of 160x120 pixels:

        8680 |    11864 | THMB |      | _____._x__.=_.__.... 0 0 0 0 0 160 0 120 0 0 46 61 0 1 0 0 255 216 255 219

The second is in the Canon uuid/canp box at the end:

   88376 |   264929 | uuid | canp | _______._...PRVW____ 0 0 0 0 0 0 0 1 0 4 10 193 80 82 86 87 0 0 0 0

Exiv2 doesn't know about either of them because they are not in IFD1!

773 rmills@rmillsm1:~/gnu/exiv2/team/book $ exiv2 -pp files/cr3.cr3 
774 rmills@rmillsm1:~/gnu/exiv2/team/book $ 
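
The header bytes in the THMB dump can be decoded by hand. The sketch below assumes a layout inferred from those bytes (4 bytes of version/flags, a 16-bit width, a 16-bit height, a 32-bit JPEG size, 4 further bytes, then the JPEG stream starting at byte 16). That layout is an assumption drawn from this one file, and `ThmbHeader` and `parseThmb` are names made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for illustration; not an Exiv2 type.
struct ThmbHeader {
    std::uint16_t width;
    std::uint16_t height;
    std::uint32_t jpegSize;
    bool valid;  // false when the buffer is too short or no JPEG SOI follows
};

// Big-endian readers (CR3 boxes are big-endian).
constexpr std::uint16_t be16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}
constexpr std::uint32_t be32(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) | (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) | static_cast<std::uint32_t>(p[3]);
}

// `data` points at the THMB box payload (the bytes after the 8-byte box header).
// Assumed layout: width at byte 4, height at byte 6, JPEG size at byte 8, JPEG at byte 16.
constexpr ThmbHeader parseThmb(const std::uint8_t* data, std::size_t len) {
    return (data == nullptr || len < 18 || data[16] != 0xFF || data[17] != 0xD8)
               ? ThmbHeader{0, 0, 0, false}
               : ThmbHeader{be16(data + 4), be16(data + 6), be32(data + 8), true};
}
```

For the bytes shown in the dump (`0 0 0 0 0 160 0 120 0 0 46 61 ...`) this yields width 160, height 120 and a JPEG size of 11837 bytes (0x2E3D).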

I must be a nerd to find this interesting!

Can this be fixed? Of course. It's only work. I'll mentor a volunteer.

clanmills commented 2 years ago

@paolobenve I have spoken to @lbschenkel about his availability to work on Exiv2 and it is limited. I know you offered to get involved with localisation and asked about a Crowdin "template" or something. As a typical native English speaker, I know nothing about localisation. Can you ask Leonardo about this? I'm sure he'll help you.

dhoulder commented 2 years ago

Am I right in thinking that this is mostly a matter of patching BmffImage::boxHandler() so that it handles THMB, PRVW and the JPEG cases in the "trak"s, (as per https://github.com/lclevy/canon_cr3 ), and stashing the appropriate offsets, sizes and mime-types in nativePreviews_ ?

clanmills commented 2 years ago

Yes, David @dhoulder, I think that's what you have to do. I have never really studied the preview code carefully. I believe it's designed to recover the previews in IFD1 of the Exif metadata. It might be easy to get it to recover images from the THMB and PRVW boxes. While on vacation last week, I started to think about updating tvisitor.cpp in my book to parse these images and haven't yet done that because I worked on Exiv2 v0.27.5 RC3 this week.

If you're interested in working on this, I will work on tvisitor.cpp and study the preview code. Two heads are better than one.

dhoulder commented 2 years ago

Hi Robin @clanmills . OK, I'll have a closer look at it over the next few days. As far as I can tell, the preview extraction for most camera raw files either grabs a preview straight out of an EXIF tag value directly, or reads a file offset and length out of an EXIF tag value and then grabs the preview from that section of the file. Since CR3 files don't store preview data in EXIF tags, some other mechanism will need to be used, but fortunately it looks like this is already catered for (for Photoshop and EPS files I think) via Image::nativePreviews_. Looks like a wet week ahead for us here in Hobart so this might be a good use of my time :-)
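
The "read an offset and length, then grab that section of the file" step is where truncated or hostile files can cause trouble. A minimal sketch of a bounds-checked slice, assuming the whole file is already in memory; `ByteView` and `slicePreview` are hypothetical names, not Exiv2 functions:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical view type: a pointer/length pair into a buffer holding the file.
struct ByteView {
    const std::uint8_t* data;
    std::size_t size;
};

// Return the sub-range [offset, offset + size) of `file`, or an empty view
// when that range does not fit inside the file.
constexpr ByteView slicePreview(ByteView file, std::uint64_t offset, std::uint64_t size) {
    // Compare against the remaining length instead of computing offset + size,
    // which could wrap around for very large values.
    return (offset > file.size || size > file.size - offset)
               ? ByteView{nullptr, 0}
               : ByteView{file.data + offset, static_cast<std::size_t>(size)};
}
```

The design choice here is to reject a bad range outright rather than clamp it, so a caller never receives a shorter preview than the metadata promised.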

clanmills commented 2 years ago

I'll look at this during the week. There are a couple of parts of my book which need more work. One is Image Previews and the other is the convertor code (which converts Exif metadata into XMP etc).

I think the convertor code is an absurdity. We should only deal with what's in a file. We should never cook up anything. We should be like SuperMan and deal in "Truth, Justice and the American Way".

Speak later in the week.

clanmills commented 2 years ago

I've had a look at the tvisitor.cpp code in my book and it's now happily dumping Exif thumbnails and the THMB and PRVW boxes in CR3 files. I'll submit that code on Monday and update the text in the book. I'll also have a look at the preview code in Exiv2 tomorrow.

We're off on vacation on Friday (for a week). Let's have a talk mid-week on Zoom. By Friday, I think we'll have a good plan about how to get this done.

postscript-dev commented 2 years ago

When working on the manpage, I used the Stonehenge.jpg file (from @clanmills) to demonstrate the different commands. When it came to PREVIEW IMAGES AND THUMBNAILS though, I was a bit stuck. Apart from the thumbnail, exiv2 cannot extract any other previews from the image. If you have time @dhoulder, would you consider adding previews for the Nikon JPEG? The manpage could then be updated and this would help others understand how the feature works. I am happy to update the manpage, if the previews are added.

clanmills commented 2 years ago

Thanks, Peter @postscript-dev . You can read my thoughts about previews here: https://exiv2.org/book/#5 I think the preview code is both complex and a little weak.

The CR3 code is read-only, so we don't require code to insert/delete/modify thumbnails.

I have updated the tvisitor.cpp code in my book today to beautifully locate and dump the previews in JPEG and CR3 files. And I've updated some parts of the book.

I don't yet totally understand the machinery for reading previews. However, it appears to be searching for particular tags in the Exif metadata such as Exif.Thumbnail.JPEGInterchangeFormat and Exif.Thumbnail.JPEGInterchangeFormatLength.

However, the image can hold thumbnails which are not in the Exif metadata. Tomorrow, I will implement NativePreviewList& BmffImage::nativePreviews() to return the THMB and PRVW images from the CR3 file. These are JPEG images. tvisitor reveals them as follows:

.../book/build $ ./tvisitor -pR ../files/cr3.cr3
...........
        8680 |    11864 | THMB |      | _____._x__.=_.__.... 0 0 0 0 0 160 0 120 0 0 46 61 0 1 0 0 255 216 255 219
      STRUCTURE OF JPEG FILE (II): ../files/cr3.cr3:32->22784:24->20552:8704->11840
       address | marker       |  length | signature
             0 | 0xffd8 SOI  
             2 | 0xffdb DQT   |     132 | _.......................................
           136 | 0xffc0 SOF0  |      17 | ._x_...!_........ = h,w = 120,160
           155 | 0xffc4 DHT   |     418 | __........________............_.........
           575 | 0xffda SOS  
         11835 | 0xffd9 EOI  
      END: ../files/cr3.cr3:32->22784:24->20552:8704->11840
    END: ../files/cr3.cr3:32->22784:24->20552
     20568 |      108 | mvhd |      | ____..4...4.___.___. 0 0 0 0 214 228 52 142 214 228 52 142 0 0 0 1 0 0 0 1
     20676 |      484 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
     21160 |      584 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
     21744 |      600 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
     22344 |      440 | trak |      | ___\tkhd___...4...4. 0 0 0 92 116 107 104 100 0 0 0 7 214 228 52 142 214 228 52 142
  END: ../files/cr3.cr3:32->22784
   22816 |    65560 | uuid |  xmp | <?xpacket begin='... 60 63 120 112 97 99 107 101 116 32 98 101 103 105 110 61 39 239 187 191
   88376 |   264929 | uuid | canp | _______._...PRVW____ 0 0 0 0 0 0 0 1 0 4 10 193 80 82 86 87 0 0 0 0
  STRUCTURE OF JPEG FILE (II): ../files/cr3.cr3:88432->264865
   address | marker       |  length | signature
         0 | 0xffd8 SOI  
         2 | 0xffdb DQT   |     132 | _.......................................
       136 | 0xffc0 SOF0  |      17 | ..8.T..!_........ = h,w = 1080,1620
       155 | 0xffc4 DHT   |     418 | __........________............_.........
       575 | 0xffda SOS  
    264871 | 0xffd9 EOI  
  END: ../files/cr3.cr3:88432->264865
  353305 | 13789912 | mdat |      | _____.j....._._..... 0 0 0 0 0 210 106 216 255 216 255 219 0 132 0 6 4 4 6 4
END: ../files/cr3.cr3  

Based on my work with tvisitor.cpp today, I'm anticipating success tomorrow. I may also have time to implement NativePreviewList& JpegImage::nativePreviews().

tvisitor reveals previews which are not seen by exiv2:

$ exiv2 -pp ~/Stonehenge.jpg
Preview 1: image/jpeg, 160x120 pixels, 10837 bytes

tvisitor identifies 4 frames (JPEG/Start of Frame0 segments). So it finds the 160x120 image in the Exif metadata, the "Big Guy" (6000x4000) and two more previews (640x424 and 1620x1080) following the "Big Guy".

.../book/build $ ./tvisitor -pR ~/Stonehenge.jpg | grep SOF0
         136 | 0xffc0 SOF0  |      17 | ._x_...!_........ = h,w = 120,160
   22392 | 0xffc0 SOF0  |      17 | ....p..!_........ = h,w = 4000,6000
 6198136 | 0xffc0 SOF0  |      17 | .......!_........ = h,w = 424,640
 6236024 | 0xffc0 SOF0  |      17 | ..8.T..!_........ = h,w = 1080,1620
.../book/build $ 
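
The `h,w` values that tvisitor prints come from the SOF0 segment, which stores a precision byte followed by a 16-bit big-endian height and width. A simplified sketch of finding the first SOF0 in a JPEG stream, assuming the stream is in memory; it only handles baseline SOF0, and `firstSof0` is a hypothetical helper, not tvisitor's code:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for illustration.
struct Dimensions {
    std::uint16_t width;
    std::uint16_t height;
    bool found;
};

// Walk marker segments from SOI until the first SOF0 (0xFFC0).
// Stops at SOS (0xFFDA) or EOI (0xFFD9); other SOFn markers are skipped like any segment.
constexpr Dimensions firstSof0(const std::uint8_t* jpeg, std::size_t len) {
    if (jpeg == nullptr || len < 4 || jpeg[0] != 0xFF || jpeg[1] != 0xD8)
        return Dimensions{0, 0, false};  // no SOI marker
    std::size_t pos = 2;
    while (pos + 4 <= len && jpeg[pos] == 0xFF) {
        const std::uint8_t marker = jpeg[pos + 1];
        if (marker == 0xDA || marker == 0xD9) break;
        if (marker == 0xC0) {
            if (pos + 9 > len) break;  // truncated SOF0 segment
            // Layout: marker(2) length(2) precision(1) height(2) width(2)
            return Dimensions{
                static_cast<std::uint16_t>((jpeg[pos + 7] << 8) | jpeg[pos + 8]),
                static_cast<std::uint16_t>((jpeg[pos + 5] << 8) | jpeg[pos + 6]),
                true};
        }
        // The length field counts its own two bytes but not the marker.
        const std::size_t segLen = (static_cast<std::size_t>(jpeg[pos + 2]) << 8) | jpeg[pos + 3];
        pos += 2 + segLen;
    }
    return Dimensions{0, 0, false};
}
```

Because the walk stops at the first SOS, it never looks at later frames in the trailer; finding those would need a scan past the end of the main image.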

David: @dhoulder My discovery today about nativePreviews() fits perfectly with your observations about Image::nativePreviews_. I'm 99% confident that I'll have a strong prototype of this on Tuesday. Do you have time on Wednesday or Thursday to talk on Zoom?

If it's wet enough in Hobart on Tuesday, you might have this working before I get out of bed on Tuesday.

dhoulder commented 2 years ago

@clanmills wrote:

Tomorrow, I will implement NativePreviewList& BmffImage::nativePreviews() to return the THMB and PRVW images from the CR3 file.

I was assuming that the existing base class implementation in Image::nativePreviews() already does the job, and all that is required in BmffImage::boxHandler() is to fill out the nativePreviews_ vector in much the same way that readWriteEpsMetadata() in epsimage.cpp does. Am I missing something?

clanmills commented 2 years ago

I think you're ahead of me in understanding this magic. Image::nativePreviews() is the base implementation and does exactly nothing. It returns an empty list. Today, I'm going to implement BmffImage::nativePreviews(). BmffImage::readMetadata() calls the box handler and I'll modify that to save the THMB and PRVW images (which are JPEGs) in the instance nativePreviews_. I think that'll fix it.
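
The pattern under discussion can be shown in miniature. This is a hypothetical sketch, not Exiv2's actual class definitions: a base class exposes a list of previews, and a format-specific handler appends to it when it meets a preview box. The field names mirror those used in this thread; `Cr3LikeImage`, `onPreviewBox` and `demo` are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a native preview record.
struct NativePreview {
    std::uint64_t position_ = 0;  // file offset of the preview data
    std::uint32_t size_ = 0;      // size in bytes
    std::uint32_t width_ = 0;
    std::uint32_t height_ = 0;
    std::string filter_;          // empty: data can be used as-is
    std::string mimeType_;
};

class Image {
public:
    virtual ~Image() = default;
    // The base accessor only exposes whatever the format reader collected.
    const std::vector<NativePreview>& nativePreviews() const { return nativePreviews_; }

protected:
    std::vector<NativePreview> nativePreviews_;
};

class Cr3LikeImage : public Image {
public:
    // Stand-in for a box handler: record where a JPEG preview lives.
    void onPreviewBox(std::uint64_t position, std::uint32_t width, std::uint32_t height,
                      std::uint32_t size) {
        NativePreview p;
        p.position_ = position;
        p.size_ = size;
        p.width_ = width;
        p.height_ = height;
        p.mimeType_ = "image/jpeg";
        nativePreviews_.push_back(p);
    }
};

// Small end-to-end check; the numbers are example values only.
inline void demo() {
    Cr3LikeImage image;
    assert(image.nativePreviews().empty());
    image.onPreviewBox(8704, 160, 120, 11837);
    assert(image.nativePreviews().size() == 1);
    assert(image.nativePreviews().front().mimeType_ == "image/jpeg");
}
```

With this shape, no override of the accessor is needed in the derived class; whether that matches the real base class is exactly the question raised above.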

I got my 10 hour time change mental arithmetic wrong yesterday. 11am in England is 9pm in Hobart. Happy to make the meeting earlier if you prefer. You can edit the meeting time in Google Calendar. Kind of odd that I messed up because I have Melbourne in the world clock of my smart watch for speaking with my friend Penny.

dhoulder commented 2 years ago

OK, getting there…

david@blackbox:Downloads$ ~/devel/exiv2/exiv2/build/bin/exiv2 -pp Canon\ -\ Canon\ EOS\ R5\ -\ 3\ 2.CR3 
Preview 1: image/jpeg, 160x120 pixels, 13225 bytes
Preview 2: image/jpeg, 1620x1080 pixels, 300348 bytes
david@blackbox:Downloads$ ~/devel/exiv2/exiv2/build/bin/exiv2 -ep Canon\ -\ Canon\ EOS\ R5\ -\ 3\ 2.CR3 
david@blackbox:Downloads$ file 'Canon - Canon EOS R5 - 3 2-preview2.jpg' 
Canon - Canon EOS R5 - 3 2-preview2.jpg: JPEG image data, baseline, precision 8, 1620x1080, components 3
david@blackbox:Downloads$ file 'Canon - Canon EOS R5 - 3 2-preview1.jpg' 
Canon - Canon EOS R5 - 3 2-preview1.jpg: JPEG image data, baseline, precision 8, 160x120, components 3

Now I just need to extract the big JPEG from inside "trak" 1

clanmills commented 2 years ago

Here's the patch to recover the THMB from a CR3 file on branch 0.27-maintenance. PRVW is very similar.

As this issue is to find previews in CR3 files, it is sufficient to deal with THMB and PRVW in BmffImage::boxHandler(), test it and ask @paolobenve to test it. A PR should be submitted to both the 0.27-maintenance and main branches. This change is too late for submission to Exiv2 v0.27.5 which is frozen.

@postscript-dev: A new issue should be opened concerning previews in JPEGs. The changes to jpegimage.cpp will be similar to bmffimage.cpp. Beware that the current implementation of JpegImage::readMetadata() does not search beyond the first SOS (Start of Scan) and we should retain that behaviour for performance reasons. We should only read the whole JPEG when asked to read the previews.

This is an important moment in my life. I believe (and hope) that I've written the last line of code I will ever contribute to Exiv2.

diff --git a/src/bmffimage.cpp b/src/bmffimage.cpp
index 7b53789b..b19362dc 100644
--- a/src/bmffimage.cpp
+++ b/src/bmffimage.cpp
@@ -76,6 +76,7 @@
 #define TAG_colr 0x636f6c72 /**< "colr" */
 #define TAG_exif 0x45786966 /**< "Exif" Used by JXL*/
 #define TAG_xml  0x786d6c20 /**< "xml"  Used by JXL*/
+#define TAG_THMB 0x54484d42 /*=< "THMB" Used by CR3*/

 // *****************************************************************************
 // class member definitions
@@ -442,6 +443,25 @@ namespace Exiv2
                 }
             } break;

+            case TAG_THMB: {
+                NativePreview nativePreview;
+                int32_t       header    = 16 ;
+                nativePreview.position_ = io_->tell()+header;
+                byte buf[header];
+                io_->read(buf,header);
+                nativePreview.width_    = getShort(buf+4,bigEndian);
+                nativePreview.height_   = getShort(buf+6,bigEndian);
+                nativePreview.size_     = getLong (buf+8,bigEndian);
+                nativePreview.filter_   = "";
+                nativePreview.mimeType_ = "image/jpeg";
+                nativePreviews_.push_back(nativePreview);
+
+                if ( bTrace ) {
+                    out << Internal::stringFormat("width,height,size = %d,%d,%d"
+                                                        ,nativePreview.width_,nativePreview.height_,nativePreview.size_);
+                }
+            } break;
+
             case TAG_cmt1:
                 parseTiff(Internal::Tag::root, box_length);
                 break;

clanmills commented 2 years ago

We've both more-or-less fixed this. I don't think we need to extract the "big guy".

dhoulder commented 2 years ago

@clanmills wrote:

We've both more-or-less fixed this.

@clanmills Yep, your patch is pretty close to mine, unsurprisingly :grin:

@clanmills wrote:

I don't think we need to extract the "big guy".

For CR2 files, exiv2 does handle the big guy, and I've used this myself in the past to get the camera original JPEG. It shouldn't be hard to do, and it would be nice to keep the behaviour consistent.

I'll work on this over the next week or two and come up with PRs for main and 0.27-maintenance. I'll need to update tests, add some battle-hardening etc.

clanmills commented 2 years ago

That sounds like a plan.

We should be cautious about extracting "the big guy" because that could get us into decoding the mdat box and that potentially involves patents. The code in bmffimage.cpp is written from the ISO specification which documents this stuff. The Exif/IPTC/XMP/ICC data is encoded as specified in the metadata standards.

If you need help with the test suite, ask on the Chat Server and somebody on the team will give you pointers. It's unfamiliar, although it's not difficult. You might be unable to add the test image to the test suite as CR3 files are usually big. Be sure to ask @paolobenve to test your code.

Happy to talk on Thursday if you wish, although it sounds as though we have our arms round this one. I'm off on vacation on Friday for a week. We're going to walk the Jurassic Coast of Dorset.

dhoulder commented 2 years ago

For anyone playing along at home: https://github.com/dhoulder/exiv2/commit/f7b5b3674eb52b85dd85d689c708ca941d854f28 Just extracts THMB and PRVW images so far. Remaining large JPEG extraction and tests are works in progress.

clanmills commented 2 years ago

That looks good, David: https://github.com/dhoulder/exiv2/commit/f7b5b3674eb52b85dd85d689c708ca941d854f28#commitcomment-57765246

I can't find my report about how I bench tested the BMFF code on big files. This is quite upsetting because I put a week's effort into that in March and documented it carefully. And now I can't find my report. I spent about 60 minutes searching for that last week and again this morning.

clanmills commented 2 years ago

@paolobenve PR #1958 has been merged into branch 'main'. Could you pull down/build the latest version of main and test it with your CR3 files. A huge thank you to @dhoulder for this contribution.

kmilos commented 2 years ago

@clanmills wrote:

@paolobenve PR #1958 has been merged into branch 'main'. Could you pull down/build the latest version of main and test it with your CR3 files.

Ah, it seems @paolobenve is trying to do this within darktable, and we are no longer API compatible on the main branch...

paolobenve commented 2 years ago

https://discuss.pixls.us/t/error-compiling-darktable-3-6-0-with-cr3-support/27470

kmilos commented 2 years ago

@paolobenve You might have better luck w/ sticking w/ 0.27-maintenance branch - the backported https://github.com/Exiv2/exiv2/pull/1968 branch of this fix specifically.

clanmills commented 2 years ago

I didn't plan to back-port this fix into v0.27.5. However, the risk is small and the gain for @paolobenve (and other darktable users) is considerable.

I prefer not to do Exiv2 v0.27.5 RC4 because it's kind of meaningless to add this if the darktable folks can't perform a build and test with RC4. I think I'm OK with merging this into v0.27.5 and releasing it on Friday (2021-10-22).

Does anybody disagree?

kmilos commented 2 years ago

I just wanted to enable @paolobenve to test and confirm this in darktable initially, but I don't disagree, it's a very nice feature to add to the next release (when we get the confirmation) ;)

paolobenve commented 2 years ago

Checking out branch origin/mergify/bp/0.27-maintenance/pr-1958 and compiling, I run into this error and many more:

[ 10%] Building CXX object src/CMakeFiles/exiv2lib_int.dir/canonmn_int.cpp.o
In file included from /home/paolo/git/exiv2/include/exiv2/metadatum.hpp:27,
                 from /home/paolo/git/exiv2/include/exiv2/tags.hpp:27,
                 from /home/paolo/git/exiv2/src/tags_int.hpp:26,
                 from /home/paolo/git/exiv2/src/tifffwd_int.hpp:26,
                 from /home/paolo/git/exiv2/src/makernote_int.hpp:25,
                 from /home/paolo/git/exiv2/src/canonmn_int.cpp:29:
/home/paolo/git/exiv2/include/exiv2/value.hpp:54:22: warning: ‘template<class> class std::auto_ptr’ is deprecated [-Wdeprecated-declarations]
   54 |         typedef std::auto_ptr<Value> AutoPtr;
      |                      ^~~~~~~~

paolobenve commented 2 years ago

branch main:

$ exiv2 --version
exiv2 1.0.0.9

Is it OK? Version 1 hasn't been released yet.

kevinbackhouse commented 2 years ago

@paolobenve: yes, use mergify/bp/0.27-maintenance/pr-1958. The auto_ptr stuff should be just warnings. All the automated checks on #1968 are passing except for the macOS builds, which are failing due to an unrelated reason that will be fixed by #1966.

kmilos commented 2 years ago

You might want to start anew with a fresh clone and that particular branch checkout. The CI passes (except a known issue on Mac), and just confirmed locally as well (mingw64):

$ git status
On branch mergify/bp/0.27-maintenance/pr-1958
Your branch is up to date with 'origin/mergify/bp/0.27-maintenance/pr-1958'.

nothing to commit, working tree clean

...

@paolobenve wrote:

is it ok?

Nope, you should be seeing

$ ./bin/exiv2 --version
exiv2 0.27.5.3

paolobenve commented 2 years ago

I'm not clear on this: in order to test with darktable is it enough to compile exiv2 or must I recompile darktable too?

In both cases I cannot see the previews; on the command line I get the error:

[exiv2 dt_exif_get_thumbnail] /home/paolo/image.CR3: /home/paolo/image.CR3: The file contains data of an unknown image type

clanmills commented 2 years ago

@paolobenve I think we can get the darktable people involved.

@kmilos or @alexvanderberkel Can you talk with Matt (McGuire?) on the ChatServer about this? I think he (or another DT engineer) can coordinate with me to get a private build of DT+v0.27.5 with @dhoulder's code. If necessary, we'll delay releasing v0.27.5 while we get this tested.

dhoulder commented 2 years ago

@paolobenve wrote:

I'm not clear on this: in order to test with darktable is it enough to compile exiv2 or must I recompile darktable too?

I think the short answer is "recompile darktable and make sure darktable's cmake is finding the correct exiv2 headers and library" (see console output and darktable's build/CMakeFiles/CMake*.log). At the moment though, the "CR3 previews" code is only in https://github.com/Exiv2/exiv2/commits/main and that requires a little surgery on the darktable source as mentioned in https://discuss.pixls.us/t/error-compiling-darktable-3-6-0-with-cr3-support/27470

Use the exiv2 branch from the pull request that @kevinbackhouse mentions in https://github.com/Exiv2/exiv2/issues/1893#issuecomment-945837798

Alternatively, maybe just build darktable against https://github.com/Exiv2/exiv2/tree/0.27-maintenance which should at least allow you to open CR3 files, but it won't have previews. When the CR3 preview code is backported to 0.27-maintenance, rebuild again and you should (hopefully) see CR3 previews. Keep an eye on https://github.com/Exiv2/exiv2/pull/1968

clanmills commented 2 years ago

I'm reopening this issue.

kmilos commented 2 years ago

Ok, here's my experiment:

  1. Download the current "official" darktable Windows pre-release build
  2. Import some CR3s, got skulls (no previews), as expected
  3. Close, clear out the XMP sidecars and darktable user folder
  4. Build #1968 and replace C:\Program Files\darktable\bin\libexiv2.dll
  5. Restart dt and import the same CR3s again - still no dice w/ previews 👎

If, on the other hand, I try the current "unofficial" dt build w/ CR3 support, it looks like I get the previews even w/ exiv2 0.27.4 👍 (@MStraeten please kindly confirm you're using vanilla exiv2 0.27.4 without any patches?)

So it looks like dt might be getting previews from the actual CR3 rawspeed decoder rather than exiv2... I'll try to confirm this so we don't have to block the 0.27.5 release.

MStraeten commented 2 years ago

The darktable-insider-program development build is based on current master, which contains rawspeed without CR3 support. So this build can read CR3 metadata (since it is built with exiv2 0.27.4 with BMFF support, as provided by msys/mingw) but cannot process the image. My unofficial build also uses exiv2 0.27.4 as provided by msys/mingw, which contains BMFF support, but with rawspeed including pr271 to support CR3.

kmilos commented 2 years ago

Thanks for the clarification!

I have done some digging: it seems that if exiv2 thumbnail extraction fails, darktable will load (and do basic processing of) the raw and downscale it to 128x128 for the thumbnail, which is why the CR3 build w/ 0.27.4 shows them.

On the other hand, it seems #1968 still returns an empty list here.
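
To make the fallback order described above concrete, here is a minimal sketch. It is not darktable's code: `load_thumbnail`, `extract_embedded` and `render_from_raw` are hypothetical names, and the only things taken from this thread are the order (embedded preview first, raw decode second) and the 128x128 target.

```python
def load_thumbnail(path, extract_embedded, render_from_raw, box=128):
    """Return (source, data) for a thumbnail of `path`.

    extract_embedded(path) is a hypothetical callable standing in for the
    exiv2 preview lookup; it may raise or return None/empty data.
    render_from_raw(path, box) is a hypothetical callable standing in for
    decoding the raw and downscaling it to box x box.
    """
    try:
        data = extract_embedded(path)
    except Exception:
        # Treat any extraction error the same as "no preview found".
        data = None
    if data:
        return ("embedded", data)
    # No usable embedded preview: fall back to the (slower) raw decode.
    return ("raw", render_from_raw(path, box))
```

With a raw decoder that handles CR3, the second branch still produces a thumbnail even when the preview list is empty, which would match what the unofficial build shows.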

kmilos commented 2 years ago

Mea culpa - I was building RC3 instead of #1968 🤦

The experiment w/ official pre-release (no CR3 decoding support) and libexiv2.dll replacement is successful!

image

I believe we're good to go!

clanmills commented 2 years ago

Ah, this is very good news indeed, @kmilos. Don't forget we need the cmake option -DEXIV2_ENABLE_BMFF=1 on the 0.27-maintenance branch.

With this result, perhaps we don't need @paolobenve to test.

@dhoulder is polishing the BMFF code at the moment. I'll decide on Thursday what to pull into 0.27.50 (= 0.27.5 Preview). Can you build/test that on Thursday? Then I'll build/publish 0.27.5 (= 0.27.5 GM) on Friday. How's that?

I'm happy to delay this until the weekend. So 0.27.50 on Saturday, you build/test. On Sunday 0.27.5 goes out the door!

kmilos commented 2 years ago

Don't forget we need the cmake option -DEXIV2_ENABLE_BMFF=1 on the 0.27-maintenance branch.

Of course, it's present in my local build script that tracks the msys2/mingw64 repo packaging recipe.

Can you build/test that on Thursday?

One can but only aim ;)

dhoulder commented 2 years ago

@kmilos wrote:

The experiment w/ official pre-release (no CR3 decoding support) and libexiv2.dll replacement is successful!

Good to hear!

See https://github.com/Exiv2/exiv2/issues/1961 for the polish that @clanmills was talking about. Should make the import significantly faster.

kmilos commented 2 years ago

Yeah, great job on the previews @dhoulder 👍

As far as the remaining polish goes, let's have a formal PR and review going for the 0.27-maintenance branch asap so we get some builds and testing done, thanks.

dhoulder commented 2 years ago

As far as the remaining polish goes, let's have a formal PR and review going for the 0.27-maintenance branch asap so we get some builds and testing done, thanks.

Done — see https://github.com/Exiv2/exiv2/pull/1974#issuecomment-947092701

paolobenve commented 2 years ago

[exiv2 dt_exif_get_thumbnail] /home/paolo/image.CR3: /home/paolo/image.CR3: The file contains data of an unknown image type

This darktable message I reported is actually an exiv2 error: darktable simply reports the error message it received from exiv2: "The file contains data of an unknown image type"

I think we can get the darktable people involved.

@clanmills why? It's an exiv2 matter: that phrase comes from exiv2's src/error.cpp, as a consequence of Exiv2::kerFileContainsUnknownImageType

dhoulder commented 2 years ago

@paolobenve I suspect you have linked against the wrong exiv2 library. It doesn't seem to have any support for CR3 files, let alone the new preview code. On Linux use ldd $whatever/bin/darktable to see which libexiv2.so you have linked against. On macOS I think you use otool -L $whatever/bin/darktable
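
If you want to script that check, here is a small sketch that picks the libexiv2 entries out of saved `ldd` output. The function name `exiv2_libraries` and the sample paths are made up for illustration; it assumes the usual `name => path (address)` layout that `ldd` prints on Linux.

```python
import re
from typing import List, Tuple

# Matches lines such as:
#   libexiv2.so.27 => /usr/local/lib/libexiv2.so.27 (0x00007f1c2a000000)
#   libexiv2.so.27 => not found
_LDD_LINE = re.compile(
    r"^[ \t]*(libexiv2\S*)[ \t]+=>[ \t]+(.*?)(?:[ \t]+\(0x[0-9a-fA-F]+\))?[ \t]*$",
    re.MULTILINE,
)

def exiv2_libraries(ldd_output: str) -> List[Tuple[str, str]]:
    """Return (soname, resolved path or 'not found') for each libexiv2 line."""
    return [(m.group(1), m.group(2)) for m in _LDD_LINE.finditer(ldd_output)]

# Illustrative input only; real output comes from `ldd $whatever/bin/darktable`.
sample = (
    "\tlinux-vdso.so.1 (0x00007ffd5a3f2000)\n"
    "\tlibexiv2.so.27 => /usr/local/lib/libexiv2.so.27 (0x00007f1c2a000000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29c00000)\n"
)
print(exiv2_libraries(sample))
```

If the path that comes back is a distro copy under /usr/lib rather than the build you just installed, that would explain the "unknown image type" error.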

clanmills commented 2 years ago

@paolobenve If your darktable is using exiv2 as a shared object, you can inspect the library at run-time with lsof (list open files), like this:

224 rmills@rmillsm1-ubuntu:~ $ lsof | grep exiv2 | grep darktable
darktable 269031                            rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269032 gmain               rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269034 gdbus               rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269035 worker              rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269036 kicker              rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269037 worker              rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269038 worker              rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269039 worker              rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269045 lua\x20th           rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269046 pool-dark           rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269047 threaded-           rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
darktable 269031 269048 darktable           rmills  mem       REG                8,2  5227136    2113351 /usr/local/lib/libexiv2.so.0.27.5.1
225 rmills@rmillsm1-ubuntu:~ $

However, you'll have to build a variant of exiv2 from branch 0.27-maintenance with BMFF enabled and David's (@dhoulder) CR3/preview code. I intend to add David's code to 0.27-maintenance tomorrow (Thursday) and Milos (@kmilos) will build and test it with darktable. We expect everything to proceed smoothly and Exiv2 v0.27.5 will be released by Sunday 2021-10-24.

After v0.27.5 has been released, you can download the Linux shared object with BMFF enabled from exiv2.org. You may need a new build of darktable built with v0.27.5 and you should request that from darktable next week.

kmilos commented 2 years ago

I just came across a different error: [exiv2] Invalid native preview position or size. - either there are some really broken files, or exiv2 is not parsing something correctly...

Take e.g. the CR3s from here:

$ ./exiv2 -pp AJKL0637.CR3
Preview 1: image/jpeg, 2x320 pixels, 14090239 bytes
Preview 2: image/jpeg, 1620x1080 pixels, 493136 bytes

That preview 1 looks very messed up: 14MB for a 2x320?

My test above was on some random CR3s from raw.pixls.us that worked out:

$ ./exiv2 -pp Canon\ -\ Canon\ EOS-1D\ X\ Mark\ III\ -\ 3_2.CR3
Preview 1: image/jpeg, 160x120 pixels, 22223 bytes
Preview 2: image/jpeg, 1620x1080 pixels, 458613 bytes

What's extra weird is that they're both from the same 1D X III model! Corruption in file transfer, or something else going on?
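
For checking a larger batch of files, the two listings above can be compared mechanically. This is a rough sketch that parses `exiv2 -pp` lines in the format shown and flags previews whose byte count is implausible for their pixel count. The name `suspicious_previews` and the 4 bytes/pixel threshold are my own assumptions, not anything exiv2 defines.

```python
import re

# Line format as printed above, e.g.
#   Preview 1: image/jpeg, 2x320 pixels, 14090239 bytes
_PREVIEW = re.compile(r"^Preview (\d+): (\S+), (\d+)x(\d+) pixels, (\d+) bytes$")

def suspicious_previews(listing, max_bytes_per_pixel=4.0):
    """Return the preview numbers whose size looks wrong for their dimensions.

    max_bytes_per_pixel is an arbitrary heuristic: uncompressed 8-bit RGB
    is 3 bytes per pixel, so a JPEG far above that is almost certainly a
    mis-parsed size or dimension.
    """
    flagged = []
    for line in listing.splitlines():
        m = _PREVIEW.match(line.strip())
        if not m:
            continue  # ignore anything that is not a preview line
        index, _mime, width, height, size = m.groups()
        pixels = int(width) * int(height)
        if pixels == 0 or int(size) / pixels > max_bytes_per_pixel:
            flagged.append(int(index))
    return flagged

bad = """Preview 1: image/jpeg, 2x320 pixels, 14090239 bytes
Preview 2: image/jpeg, 1620x1080 pixels, 493136 bytes"""
good = """Preview 1: image/jpeg, 160x120 pixels, 22223 bytes
Preview 2: image/jpeg, 1620x1080 pixels, 458613 bytes"""
print(suspicious_previews(bad), suspicious_previews(good))
```

Only preview 1 of AJKL0637.CR3 trips the check; both previews of the raw.pixls.us file pass.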

clanmills commented 2 years ago

@dhoulder Can you have a look please? I have a Euphonium lesson about to start (09:30 BST). I'll look at this with tvisitor.cpp after my lesson.

clanmills commented 2 years ago

tvisitor.cpp parses Canon - Canon EOS R6 - 3_2.CR3 correctly and finds two thumbnails: 120x160 and 1080x1620.

tvisitor.cpp fails to read AJKL0637.CR3. Gives up on 'corrupted file'. It might be corrupted. It's more likely that the parser has made an error. Maybe we need more time to work on CR3/preview.

@dhoulder I have 170 CR3 files on the MacMini. 164 are from pixls.us (including Canon - Canon EOS R6 - 3_2.CR3), 4 from Gordon Laing and 2 from libheif (AJKL0637.CR3 and AJKL0638.CR3). I can provide you with ssh access to the MacMini. Can we discuss ssh access by email, please?

clanmills commented 2 years ago

I'm opening this again. I think it's getting closed by the linked PR. Rats. Strong willed Robots!