Open ross-spencer opened 3 years ago
Another example I've been able to recreate the problem with.
I encountered the same error with a PDF that was signed with DocuSign via (I'm pretty sure) UC Santa Cruz's DocuSign portal (https://its.ucsc.edu/docusign/index.html). Unfortunately, I cannot share the PDF because it contains identifying information, but I am copying the JHOVE output below:
Jhove (Rel. 1.24.1, 2020-03-16)
Date: 2022-03-24 12:45:38 EDT
RepresentationInformation: ya-rg2294-Lewites-release.pdf
ReportingModule: PDF-hul, Rel. 1.12.2 (2019-12-10)
LastModified: 2022-03-24 11:32:22 EDT
Size: 595132
Format: PDF
Version: 1.5
Status: Not well-formed
SignatureMatches:
PDF-hul
ErrorMessage: Unexpected exception java.lang.NullPointerException
ID: PDF-HUL-94
MIMEtype: application/pdf
PDFMetadata:
Objects: 113
FreeObjects: 3
IncrementalUpdates: 2
DocumentCatalog:
PageLayout: SinglePage
PageMode: UseNone
Info:
Title:
Author:
Subject:
ID: 0x62623566356537642d626531322d346562322d626237372d303237313039346565326232, 0x4c53096e3584a1e81fb1b63f1331aae9
Filters:
FilterPipeline: FlateDecode
FilterPipeline: DCTDecode
Images:
Image:
NisoImageMetadata:
FormatName: image/jpg
CompressionScheme: JPEG
ImageWidth: 2550
ImageHeight: 476
BitsPerSample: 8
BitsPerSampleUnit: integer
Intent: Perceptual
Interpolate: true
Image:
NisoImageMetadata:
FormatName: image/jpg
CompressionScheme: JPEG
ImageWidth: 2550
ImageHeight: 476
BitsPerSample: 8
BitsPerSampleUnit: integer
Intent: Perceptual
Interpolate: true
Fonts:
Type0:
Font:
BaseFont: TimesNewRomanPSMT
Encoding: Identity-H
ToUnicode: true
Font:
BaseFont: SymbolMT
Encoding: Identity-H
ToUnicode: true
TrueType:
Font:
BaseFont: TimesNewRomanPSMT
FirstChar: 32
LastChar: 122
FontDescriptor:
FontName: TimesNewRomanPSMT
Flags: Nonsymbolic
FontBBox: -568, -216, 2046, 693
Encoding: WinAnsiEncoding
Font:
BaseFont: UKHNYQ+Georgia
FontSubset: true
FirstChar: 32
LastChar: 122
FontDescriptor:
FontName: UKHNYQ+Georgia
Flags: Nonsymbolic
FontBBox: -490, -303, 1797, 1075
FontFile2: true
Encoding: MacRomanEncoding
Font:
BaseFont: BUXLXF+TimesNewRomanPSMT
FontSubset: true
FirstChar: 33
LastChar: 46
FontDescriptor:
FontName: BUXLXF+TimesNewRomanPSMT
Flags: Symbolic
FontBBox: -568, -307, 2046, 1039
FontFile2: true
ToUnicode: true
Font:
BaseFont: ArialMT
FirstChar: 32
LastChar: 32
FontDescriptor:
FontName: ArialMT
Flags: Nonsymbolic
FontBBox: -665, -210, 2000, 728
Encoding: WinAnsiEncoding
Font:
BaseFont: PCYGQU+Calibri
FontSubset: true
FirstChar: 33
LastChar: 33
FontDescriptor:
FontName: PCYGQU+Calibri
Flags: Symbolic
FontBBox: -503, -313, 1240, 1026
FontFile2: true
ToUnicode: true
Font:
BaseFont: UVDZOW+TimesNewRomanPSMT
FontSubset: true
FirstChar: 33
LastChar: 93
FontDescriptor:
FontName: UVDZOW+TimesNewRomanPSMT
Flags: Symbolic
FontBBox: -568, -307, 2046, 1039
FontFile2: true
ToUnicode: true
Font:
BaseFont: VZTQZF+TimesNewRomanPS-BoldMT
FontSubset: true
FirstChar: 33
LastChar: 41
FontDescriptor:
FontName: VZTQZF+TimesNewRomanPS-BoldMT
Flags: Symbolic
FontBBox: -558, -328, 2000, 1055
FontFile2: true
ToUnicode: true
Font:
BaseFont: TimesNewRomanPS-BoldMT
FirstChar: 32
LastChar: 121
FontDescriptor:
FontName: TimesNewRomanPS-BoldMT
Flags: Nonsymbolic
FontBBox: -558, -216, 2000, 677
Encoding: WinAnsiEncoding
CIDFontType2:
Font:
BaseFont: TimesNewRomanPSMT
CIDSystemInfo:
Registry: Adobe
Registry: Identity
Supplement: 0
FontDescriptor:
FontName: TimesNewRomanPSMT
Flags: Nonsymbolic
FontBBox: -568, -216, 2046, 693
FontFile2: true
Font:
BaseFont: SymbolMT
CIDSystemInfo:
Registry: Adobe
Registry: Identity
Supplement: 0
FontDescriptor:
FontName: SymbolMT
Flags: Nonsymbolic
FontBBox: 0, -216, 1113, 693
FontFile2: true
XMP:
This issue should now be fixed in the current integration branch, if any of you want to confirm before the next major release.
@carlwilson I think this can be closed now, unless you usually wait for a proper release to include related fixes?
I'm leaving closing issues until the final build is ready then I'll run down and double-test them just to be sure.
@carlwilson Did you have time to double-test yet with the new 1.28 release (thanks!)? As I said I believe this issue could now be closed and is the only one of the linked issues in the release notes that is still open.
@prettybits I didn't notice this fix, but thanks for looking at it. From my perspective, I can see the error is no longer occurring for the files attached above. Tested on openjdk version "11.0.19" 2023-04-18 with PDF-HUL 1.12.4.
Apologies all, I've still got to triage the open errors and test. It's a day's work and I'll be doing it ASAP, realistically in the next 3 weeks.
Attached are two files exhibiting the same problem. They were created with the DocuSign demo (https://secure.docusign.com/demo) using its two export methods.
For all intents and purposes it looks as if the validation completes, and the validation result is "Not well-formed". This seems to be because of the error message raised:
ErrorMessage: Unexpected exception java.lang.NullPointerException
The results are below:
With logging turned on I am not seeing any other confirmation of the error, or what's causing it, i.e. no stack trace. The available log lines for both files are:
This is similar to https://github.com/openpreserve/jhove/issues/256, but not the same. The files in #256 fail with JHOVE reporting that "Validation ended prematurely due to an unhandled exception." Here the validation completes, but the result contains the null pointer exception.
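For anyone triaging a batch of reports, the distinction between the two failure modes could be checked with something like the sketch below. It is only a rough illustration: `classify_report` and its labels are hypothetical, it assumes JHOVE's plain-text output with `Key: value` lines as pasted in this thread, and it assumes the #256 message appears verbatim somewhere in the report text.

```python
# Message strings as quoted in this thread. Where exactly they appear in a
# JHOVE report is an assumption, not something confirmed against JHOVE itself.
NPE_MESSAGE = "Unexpected exception java.lang.NullPointerException"
PREMATURE_MESSAGE = "Validation ended prematurely due to an unhandled exception"


def classify_report(report_text):
    """Return a coarse label for one JHOVE text report (hypothetical helper)."""
    # Failure mode described in #256: validation stops early.
    if PREMATURE_MESSAGE in report_text:
        return "premature-end"

    status = None
    errors = []
    for line in report_text.splitlines():
        # Split each "Key: value" line at the first colon only.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Status" and status is None:
            status = value
        elif key == "ErrorMessage":
            errors.append(value)

    # Failure mode in this issue: the report is complete but carries the NPE.
    if any(NPE_MESSAGE in message for message in errors):
        return "completed-with-npe"
    if status == "Not well-formed":
        return "not-well-formed-other"
    return "other"
```

Feeding it the `Status` and `ErrorMessage` lines from the report in this thread should give `completed-with-npe`, while a report containing the #256 message should give `premature-end`.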
It might be necessary for others to confirm this issue for these two files.
Currently using:
docusign-summary-examples.zip