Closed karenhanson closed 1 year ago
Patch and project coverage have no change.
Comparison is base (9ff7d97) 46.87% compared to head (dc26346) 46.87%.
I've finished with the PDF module contributions now and will be looking at this next. Are there any available test files for this @karenhanson ?
I've tested this, and the only "objection" I have is that the error list is repeated for each run, so there are far too many copies of the same error. I have (half) a plan to reduce this, but would be interested to hear your thoughts @karenhanson
Sorry for the delay, and thanks for taking a look at this - I hadn't noticed the duplicated errors. Just to add our offline conversation to the PR's record... I think the approach you emailed me (paraphrased: if (a) ID error identical (b) offset identical and >-1 (i.e. has a loc) (c) message identical and (d) sub-message identical; then don't print error as it is a duplicate) sounds correct - unless there are likely to be duplicates with no loc, which seems like a bigger issue. If you want me to do a manual re-test after any changes or if there is anything else I can do to help, please let me know.
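For the record, the duplicate check paraphrased above could be sketched roughly as follows. This is only an illustration under assumptions: ReportedError and DuplicateErrorFilter are hypothetical names, not JHOVE classes, and the real change may be structured quite differently. An error is dropped only when its ID, offset, message and sub-message all match an earlier error and the offset is greater than -1; errors with no location are always kept.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class DuplicateErrorFilter {

    /** Hypothetical stand-in for a reported error; not a JHOVE class. */
    static final class ReportedError {
        final String id;
        final long offset;        // -1 means "no location"
        final String message;
        final String subMessage;

        ReportedError(String id, long offset, String message, String subMessage) {
            this.id = id;
            this.offset = offset;
            this.message = message;
            this.subMessage = subMessage;
        }

        // Two errors are equal when all four fields match: conditions (a) to (d).
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ReportedError)) return false;
            ReportedError other = (ReportedError) o;
            return offset == other.offset
                    && Objects.equals(id, other.id)
                    && Objects.equals(message, other.message)
                    && Objects.equals(subMessage, other.subMessage);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, offset, message, subMessage);
        }
    }

    /** Returns the errors in their original order with located duplicates removed. */
    static List<ReportedError> dedupe(List<ReportedError> errors) {
        Set<ReportedError> seen = new HashSet<>();
        List<ReportedError> kept = new ArrayList<>();
        for (ReportedError e : errors) {
            // Only errors with a location (offset > -1) are candidates for removal.
            boolean duplicate = e.offset > -1 && !seen.add(e);
            if (!duplicate) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

With this shape, two identical errors at offset 8 collapse into one, while two identical errors at offset -1 are both printed, which is the case flagged above as possibly needing a separate look.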
I came across a JPEG file that appears to have a corrupted subIFD in its first IFD. I assume it's corrupted because the number of fields coming up for it is 0 and the next offset reference points back to byte 8, which is also the first IFD offset (its parent). The result is an ever-looping parseIFDChain() that eventually crashes with a StackOverflowError.
The code is not detecting that it is repeatedly visiting the same IFD, so I copied the solution from the code linked here into the parseIFDChain() method: https://github.com/openpreserve/jhove/blob/1bb234219c08a0bdce8f15dd003c3fb920e51814/jhove-modules/tiff-hul/src/main/java/edu/harvard/hul/ois/jhove/module/TiffModule.java#L1190-L1192 This way I can re-use the error message and apply the same logic, so the fix exits after 50 loops.
It might be more elegant to check for duplicated offset or next values in the list of IFDs, but I wasn't sure whether that route might have unintended consequences, since I'm not familiar with the TIFF Exif format (beyond what I figured out today while debugging). Let me know if you see a better way to fix this and I can try to work on it some more.
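As a rough illustration of the alternative mentioned above (checking for a repeated offset rather than only counting loops), here is a minimal sketch. It is not the JHOVE implementation: IfdChainWalker, walk() and the map of next-offsets are hypothetical stand-ins for the real parsing code, and the limit of 50 simply mirrors the loop count used in this PR. The sketch assumes a next-IFD offset of 0 marks the end of a chain, as in TIFF, and it walks the chain iteratively so that a cycle cannot exhaust the stack.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class IfdChainWalker {

    /** Upper bound on chain length; mirrors the 50-loop limit in this PR. */
    static final int MAX_IFDS = 50;

    /**
     * Walks a chain of IFD offsets. The map is a hypothetical model of the file:
     * key = offset of an IFD, value = offset of the next IFD (0 ends the chain).
     * Throws IllegalStateException if an offset is visited twice or the chain
     * exceeds MAX_IFDS entries.
     */
    static List<Long> walk(long firstOffset, Map<Long, Long> nextOffsets) {
        List<Long> visitedInOrder = new ArrayList<>();
        Set<Long> seen = new HashSet<>();
        long offset = firstOffset;
        while (offset != 0L) {
            // A repeated offset means the chain loops back on itself.
            if (!seen.add(offset)) {
                throw new IllegalStateException("IFD chain revisits offset " + offset);
            }
            // Safety net for very long, non-repeating chains.
            if (visitedInOrder.size() >= MAX_IFDS) {
                throw new IllegalStateException("More than " + MAX_IFDS + " IFDs in chain");
            }
            visitedInOrder.add(offset);
            Long next = nextOffsets.get(offset);
            offset = (next == null) ? 0L : next; // unknown offset: treat as end of chain
        }
        return visitedInOrder;
    }
}
```

For a file like the one described, where an IFD's next offset points back to byte 8, the walk would stop on the second visit to offset 8 instead of after 50 iterations. Whether a repeated offset is always an error in real TIFF/Exif files is exactly the open question raised above, so the counter is kept as a fallback.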