openpreserve / jhove

File validation and characterisation.
http://jhove.openpreservation.org

Is it an Error or Information message? And can more business logic be made available? #866

Open RvanVeenendaal opened 1 year ago

RvanVeenendaal commented 1 year ago

Would it be possible to add a severity indication of 'Error' or 'Information' to the JHOVE Hul wiki pages?

At the National Archives of the Netherlands, we're working on a solution for sharing information about JHOVE error messages. The idea is that organizations can share their 'further analysis', 'preservation planning' and 'preservation action' work whenever they encounter JHOVE error messages.

It is great that the JHOVE Hul wiki pages have sections for Id, Message, Details, Impact, References and Remediation. That is useful 'common knowledge' information to link to. What would also be very helpful is to make explicit which error messages are in fact information messages. That knowledge has been programmed into the source code, so it is available.

For the PDF Hul the distinction between information message and error message is made explicit in the errormessages.properties file. For most other Huls, this information can only be gathered from the JHOVE output or the JHOVE source code (where e.g. a new InfoMessage or a new ErrorMessage is created).
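As a rough illustration of how that information could be gathered from the source code, here is a minimal Python sketch that scans Java source text for the `new ErrorMessage(` and `new InfoMessage(` constructor calls mentioned above and records which type each message is created with. It assumes the first constructor argument identifies the message. The `report(...)` call and the `EXAMPLE_*` constant names in the sample are hypothetical and not taken from the JHOVE code base.

```python
import re
from collections import defaultdict

# Matches constructor calls such as `new ErrorMessage(` or `new InfoMessage(`
# and captures the first argument when it is a plain identifier.
# The two class names come from the issue text; everything in SAMPLE is made up.
MESSAGE_CALL = re.compile(
    r"new\s+(ErrorMessage|InfoMessage)\s*\(\s*([A-Za-z_][\w.]*)"
)


def severity_by_message(java_source: str) -> dict[str, set[str]]:
    """Map each first constructor argument to the message types it is used with."""
    found = defaultdict(set)
    for match in MESSAGE_CALL.finditer(java_source):
        message_type, first_arg = match.groups()
        found[first_arg].add(message_type)
    return dict(found)


# Hypothetical Java fragment, used only to exercise the scan.
SAMPLE = """
report(new ErrorMessage(EXAMPLE_ERR_1, offset));
report(new InfoMessage(EXAMPLE_INF_1));
"""

if __name__ == "__main__":
    for arg, types in sorted(severity_by_message(SAMPLE).items()):
        print(arg, sorted(types))
```

A text scan like this misses messages whose first argument is not a simple identifier, so its result would need checking against the actual JHOVE output.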

If this severity information were included in the Hul wiki pages, it would help people prioritize which errors to investigate first.

NB: This information is available in the JHOVE output, where people will probably encounter it first. But relying on the output alone implies that we as a community can only prioritize which errors to investigate if there is a test corpus that yields every possible JHOVE error. Also: why hide such valuable business logic away in the source code?

NB: It might even be possible to make explicit whether an error results in a file being invalid and/or not well-formed. Files can have multiple errors, so knowing which ones contributed to which state might be useful for prioritizing error investigations. And again: why hide away this valuable business logic in the source code?

RvanVeenendaal commented 1 year ago

I may have been hasty in submitting this issue, as there is a "Type:XXXXMessage" entry in the Details sections. So I stand corrected: this business logic is already made available on the Hul wiki pages. But perhaps the other part of the issue (whether an error makes a file invalid and/or not well-formed) can be taken into consideration?

carlwilson commented 1 year ago

The "wish item" for the Wiki is appreciated @RvanVeenendaal and we will take a look once the next release is ready.