openpreserve / jhove

File validation and characterisation.
http://jhove.openpreservation.org

Jhove validates truncated TIFF using BYTESTREAM module (even though it recognises it as a TIFF) #316

Open bitsgalore opened 6 years ago

bitsgalore commented 6 years ago

Dev Effort

2D

Description

Attached below is a truncated TIFF file (zipped). If I run the following with JHOVE 1.18.1:

jhove truncated.tiff

Result:

[Fatal Error] :155:4: XML document structures must start and end within the same entity.
Jhove (Rel. 1.18.1, 2017-11-30)
 Date: 2018-03-15 17:00:06 CET
 RepresentationInformation: truncated.tiff
  ReportingModule: BYTESTREAM, Rel. 1.3 (2007-04-10)
  LastModified: 2018-03-15 16:50:14 CET
  Size: 65536
  Format: bytestream
  Status: Well-Formed and valid
  SignatureMatches:
   TIFF-hul
  MIMEtype: application/octet-stream

For some reason JHOVE uses the BYTESTREAM module to validate this file, even though it reports a signature match for the TIFF-hul module! If I explicitly tell JHOVE to use TIFF-hul, the result is as expected:

jhove -m TIFF-hul truncated.tiff

Result:

[Fatal Error] :155:4: XML document structures must start and end within the same entity.
Jhove (Rel. 1.18.1, 2017-11-30)
 Date: 2018-03-15 17:01:19 CET
 RepresentationInformation: truncated.tiff
  ReportingModule: TIFF-hul, Rel. 1.8 (2017-05-11)
  LastModified: 2018-03-15 16:50:14 CET
  Size: 65536
  Format: TIFF
  Status: Not well-formed
  SignatureMatches:
   TIFF-hul
  ErrorMessage: Value offset not word-aligned: 7177
   Offset: 198
  MIMEtype: image/tiff
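
The mismatch between the two reports above can be flagged mechanically. Below is a minimal Python sketch, assuming the plain-text layout shown in this issue (the `ReportingModule`, `Status` and `SignatureMatches` fields); other JHOVE versions or output handlers may format things differently. The function names `parse_jhove_text` and `is_suspicious` are hypothetical helpers, not part of JHOVE.

```python
def parse_jhove_text(report: str) -> dict:
    """Pull module name, status and signature matches out of JHOVE text output.

    Assumes the field layout seen in this issue; not an official parser.
    """
    result = {"module": None, "status": None, "signature_matches": []}
    in_matches = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped.startswith("ReportingModule:"):
            # "BYTESTREAM, Rel. 1.3 (2007-04-10)" -> "BYTESTREAM"
            result["module"] = stripped.split(":", 1)[1].split(",")[0].strip()
            in_matches = False
        elif stripped.startswith("Status:"):
            result["status"] = stripped.split(":", 1)[1].strip()
            in_matches = False
        elif stripped.startswith("SignatureMatches:"):
            # Following lines without a colon are module names.
            in_matches = True
        elif in_matches and stripped and ":" not in stripped:
            result["signature_matches"].append(stripped)
        else:
            in_matches = False
    return result


def is_suspicious(parsed: dict) -> bool:
    """True when BYTESTREAM reported, but a more specific module matched the signature."""
    return parsed["module"] == "BYTESTREAM" and any(
        match != "BYTESTREAM" for match in parsed["signature_matches"]
    )


# Excerpt of the first report from this issue.
SAMPLE = """\
  ReportingModule: BYTESTREAM, Rel. 1.3 (2007-04-10)
  Format: bytestream
  Status: Well-Formed and valid
  SignatureMatches:
   TIFF-hul
  MIMEtype: application/octet-stream
"""

if __name__ == "__main__":
    print(is_suspicious(parse_jhove_text(SAMPLE)))
```

A check like this only works around the behaviour described here: it lets a batch workflow treat "Well-Formed and valid" from BYTESTREAM as untrustworthy whenever a format-specific signature matched.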

(As an aside, JHOVE reports a Fatal Error that appears to be caused by malformed embedded XML metadata. I don't know whether this is related to this bug, but in any case I think JHOVE should be able to handle this without raising a fatal error.)

truncated.tiff.zip

MartinSpeller commented 4 years ago

Jhove validates truncated TIFF using BYTESTREAM module (even though it recognises it as a TIFF) #316 - Assigned to TBA