openpreserve / jhove

File validation and characterisation.
http://jhove.openpreservation.org

Problem with TIFF file validation #847

Closed perdrix52 closed 1 year ago

perdrix52 commented 1 year ago

I'm trying to force a TIFF file I write to have the main IFD at the beginning rather than the end.

So when I have written all the TIFF tags to the main IFD, I switch to create the EXIF IFD, populate that and then issue:

        TIFFSetDirectory(m_tiff, 0);
        TIFFSetField(m_tiff, TIFFTAG_EXIFIFD, dir_offset_EXIF);
        //TIFFCheckpointDirectory(m_tiff);
        TIFFWriteDirectory(m_tiff);
        TIFFSetDirectory(m_tiff, 0);

then I write the image data.

If I just call TIFFClose at that point, jHove is happy with the file:

C:\Users\amonra\jhove>jhove "C:\Users\amonra\Documents\Astrophotography\DSS Test Images\Covington\TestImages\Bias\MasterOffset_ISO200.tif"
Jhove (Rel. 1.26.1, 2022-07-14)
 Date: 2023-04-13 10:17:09 BST
 RepresentationInformation: C:\Users\amonra\Documents\Astrophotography\DSS Test Images\Covington\TestImages\Bias\MasterOffset_ISO200.tif
  ReportingModule: TIFF-hul, Rel. 1.9.3 (2022-04-22)
  LastModified: 2023-04-13 06:46:48 BST
  Size: 36040428
  Format: TIFF
  Version: 6.0
  Status: Well-Formed and valid
  SignatureMatches:
   TIFF-hul
  InfoMessage: Unknown TIFF IFD tag: 340
   ID: TIFF-HUL-12
   Offset: 36040184
     :
  MIMEtype: image/tiff
  Profile: TIFF/IT-MP (ISO 12639:1998)
  TIFFMetadata:
   ByteOrder: little-endian
   IFDs:
    Number: 2
    IFD:
     Offset: 36039970
     Type: TIFF
     Entries:
      NisoImageMetadata:
       FormatName: image/tiff
       ByteOrder: little_endian
       CompressionScheme: uncompressed
       ImageWidth: 5202
       ImageHeight: 3464     

    etc ...

          TIFFEPProperties:
       CFARepeatPatternDim: 2, 2
       CFAPattern: 1, 2, 0, 1
    IFD:
     Offset: 418
     Type: Exif
     Entries:
      ExifVersion: 0231
      FlashpixVersion: 0100
      ExposureTime: 0
      FNumber: 4
      ISOSpeedRatings: 200, 200, 254
      FocalPlaneResolutionUnit: inches
      CFAPattern: 2, 0, 2, 0, 1, 2, 0, 1
      Contrast: normal

If I add a call to TIFFWriteDirectory immediately before the call to TIFFClose, jHove detects the file as an ASCII bytestream, and if I invoke it with -m TIFF-hul, I get:

C:\Users\amonra\jhove>jhove -m TIFF-hul "C:\Users\amonra\Documents\Astrophotography\DSS Test Images\Covington\TestImages\Bias\MasterOffset_ISO200.tif" | more
Jhove (Rel. 1.26.1, 2022-07-14)
 Date: 2023-04-13 10:30:14 BST
 RepresentationInformation: C:\Users\amonra\Documents\Astrophotography\DSS Test Images\Covington\TestImages\Bias\MasterOffset_ISO200.tif
  ReportingModule: TIFF-hul, Rel. 1.9.3 (2022-04-22)
  LastModified: 2023-04-13 10:26:44 BST
  Size: 36039970
  Format: TIFF
  Version: 6.0
  Status: Not well-formed
  SignatureMatches:
   TIFF-hul
  InfoMessage: Unknown TIFF IFD tag: 340
   ID: TIFF-HUL-12
   Offset: 222
  InfoMessage: Unknown TIFF IFD tag: 341
   ID: TIFF-HUL-12
   Offset: 234
  InfoMessage: Unknown TIFF IFD tag: 50002
   ID: TIFF-HUL-12
   Offset: 282
  InfoMessage: Unknown TIFF IFD tag: 50006
   ID: TIFF-HUL-12
   Offset: 294
  InfoMessage: Unknown TIFF IFD tag: 50007
   ID: TIFF-HUL-12
   Offset: 306
  InfoMessage: Unknown TIFF IFD tag: 50008
   ID: TIFF-HUL-12
   Offset: 318
  InfoMessage: Unknown TIFF IFD tag: 50009
   ID: TIFF-HUL-12
   Offset: 330
  InfoMessage: Unknown TIFF IFD tag: 50010
   ID: TIFF-HUL-12
   Offset: 342
  ErrorMessage: Unknown data type: Type = 30068, Tag = 25449
   ID: TIFF-HUL-3
   Offset: 422
  ErrorMessage: Unknown data type: Type = 8307, Tag = 25970
   ID: TIFF-HUL-3
   Offset: 434
  ErrorMessage: Tag 19752 out of sequence
   ID: TIFF-HUL-2
   Offset: 444
  ErrorMessage: Unknown data type: Type = 25701, Tag = 19752
   ID: TIFF-HUL-3
   Offset: 446
  ErrorMessage: Unknown data type: Type = 10606, Tag = 24937
   ID: TIFF-HUL-3
   Offset: 458
  ErrorMessage: Tag 0 out of sequence
   ID: TIFF-HUL-2
   Offset: 468

       --- LOTS MORE like that

Is this me or jHove? I can process the file quite happily, and it opens in Photoshop/GIMP/IrfanView etc. just fine ...
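
One way to sanity-check errors like these is to look at the bogus tag and type numbers as raw bytes. The sketch below is only an illustration (the helper names `le16_as_ascii` and `entry_prefix_as_ascii` are made up for this note, and it assumes the little-endian byte order JHOVE reported for the good file). It shows the first four bytes of each failing entry as text:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: return the two bytes of a 16-bit value in the order
// a little-endian TIFF stores them (low byte first). Bytes outside the
// printable ASCII range are shown as '.'.
inline std::string le16_as_ascii(std::uint16_t v) {
    const unsigned char bytes[2] = {
        static_cast<unsigned char>(v & 0xFF),
        static_cast<unsigned char>(v >> 8)
    };
    std::string out;
    for (unsigned char b : bytes) {
        out += (b >= 0x20 && b < 0x7F) ? static_cast<char>(b) : '.';
    }
    return out;
}

// Hypothetical helper: an IFD entry starts with a 2-byte tag followed by a
// 2-byte type, so together they are the first four bytes of the entry.
inline std::string entry_prefix_as_ascii(std::uint16_t tag, std::uint16_t type) {
    return le16_as_ascii(tag) + le16_as_ascii(type);
}
```

Fed with the four "Unknown data type" pairs reported above, (25449, 30068), (25970, 8307), (19752, 25701) and (24937, 10606), this gives "ictu", "res ", "(Med" and "ian)". That looks like fragments of text rather than directory entries, which suggests something wrote string data over the spot where JHOVE expects an IFD.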

Thanks David

carlwilson commented 1 year ago

Hi @perdrix52 I'd like to reproduce this but might need a little more information. Which library are you using to create the files please? Is it possible that you might be able to attach two as examples here please?

perdrix52 commented 1 year ago

I'm using libtiff 4.5.0. I can provide source code and a link to a LARGE file (sorry, my software processes large images).

https://www.dropbox.com/s/moe5rtp4wg1ra21/MasterOffset_ISO200.tif?dl=1

Source code attached (TIFFUtil.zip): the member functions you want to look at are CTIFFWriter::Open(), CTIFFWriter::Write() and CTIFFWriter::Close().

I can read the tags in the base IFD fine, but TIFFReadEXIFDirectory() fails, so the file is definitely somewhat damaged.

Thanks David

perdrix52 commented 1 year ago

Never mind I worked it out - writing the data added two additional fields to the base IFD - specifically:

StripOffsets (3 Long): 514, 16771762, 33543010
StripByteCounts (3 Long): 16771248, 16771248, 2496960

They were added by the libtiff code internally (not by my code).

Which meant that the base IFD was larger after writing the data than it was before, so TIFFWriteDirectory overwrote the EXIF IFD, resulting in the mess above! So with this approach there's no way to avoid the base IFD being at the end of the file.
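
To put rough numbers on that growth, here is a small sketch of the size arithmetic for a classic (non-BigTIFF) directory, following the TIFF 6.0 layout: a 2-byte entry count, 12 bytes per entry, and a 4-byte offset to the next IFD. The function names are invented for this illustration, and it only counts bytes. It says nothing about where libtiff actually places the directory or the array data.

```cpp
#include <cstddef>

// Size in bytes of a classic TIFF IFD holding `entries` tags:
// 2-byte entry count + 12 bytes per entry + 4-byte next-IFD offset.
constexpr std::size_t ifd_size(std::size_t entries) {
    return 2 + 12 * entries + 4;
}

// A tag's value is stored inside the entry only if it fits in the 4-byte
// value field; otherwise the entry holds an offset and the data lives
// elsewhere in the file. Returns the bytes needed outside the entry.
constexpr std::size_t out_of_line_bytes(std::size_t count, std::size_t elem_size) {
    return count * elem_size > 4 ? count * elem_size : 0;
}

// Extra bytes needed once StripOffsets and StripByteCounts are added as
// LONG (4-byte) arrays with one element per strip: two new 12-byte
// entries plus any array data that does not fit inline.
constexpr std::size_t growth_from_strip_tags(std::size_t strips) {
    return 2 * 12 + 2 * out_of_line_bytes(strips, 4);
}

static_assert(ifd_size(0) == 6, "empty IFD is count + next-IFD offset");
static_assert(growth_from_strip_tags(3) == 48, "3 strips: 24 + 24 bytes");
```

For the three strips listed above, the directory itself gains 24 bytes and the two arrays need another 24 bytes somewhere in the file, so anything written directly after the original directory is at risk when it is rewritten in place.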

carlwilson commented 1 year ago

That makes sense. Thanks for the follow-up info and explanation.