openpreserve / odf-validator

Open source Open Document Format (ODF) validation
http://odf.openpreservation.org/
BSD 3-Clause "New" or "Revised" License

Validator crashes (0.14 CLI MAN-4) #194

Open maria-messerschmidt opened 1 month ago

maria-messerschmidt commented 1 month ago

AT012.ods

When validating the attached file with a profile, the validator crashes. I expected MAN-4 and POL_2 to be reported.

APP-1: [INFO] Validating C:\Users\maria\Desktop\014test\AT012.ods.
APP-4: [INFO] Validation report for C:\Users\maria\Desktop\014test\AT012.ods.
DOC-2: [INFO] package OpenDocument version 1.3 detected.
DOC-3: [INFO] mimetype OpenDocument MIMETYPE application/vnd.oasis.opendocument.spreadsheet detected
MAN-4: [ERROR] META-INF\manifest.xml The manifest SHALL contain an entry for every file in the package, manifest file entry styles.xml has no corresponding zip entry.
NOT VALID, 1 errors, 0 warnings and 2 info messages.

java.lang.NullPointerException: Parameter toTest can not be null.
    at java.base/java.util.Objects.requireNonNull(Objects.java:259)
    at org.openpreservation.format.xml.XmlParser.parse(XmlParser.java:104)
    at org.openpreservation.odf.validation.rules.MacroRule.checkOdfScriptXml(MacroRule.java:63)
    at org.openpreservation.odf.validation.rules.MacroRule.check(MacroRule.java:46)
    at org.openpreservation.odf.validation.rules.ProfileImpl.getRulesetMessages(ProfileImpl.java:62)
    at org.openpreservation.odf.validation.rules.ProfileImpl.check(ProfileImpl.java:52)
    at org.openpreservation.odf.apps.CliValidator.profileReport(CliValidator.java:87)
    at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:58)
    at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:34)
    at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
    at picocli.CommandLine.access$1500(CommandLine.java:148)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
    at picocli.CommandLine.execute(CommandLine.java:2170)
    at org.openpreservation.odf.apps.CliValidator.main(CliValidator.java:98)
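
The trace shows a null argument reaching XmlParser.parse via MacroRule.checkOdfScriptXml, and the MAN-4 message reports that styles.xml is listed in the manifest but has no zip entry. Whether the two are connected is not confirmed in this thread. As a minimal sketch of that failure pattern and one possible guard, assuming the null comes from looking up an entry that is not in the package: every name below (NullGuardSketch, parse, checkEntry, the in-memory map of entries) is hypothetical and is not the validator's actual API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; not code from odf-validator.
final class NullGuardSketch {

    // Stand-in for a parser with the same precondition as in the trace:
    // a null argument is rejected with a NullPointerException.
    static String parse(InputStream toTest) {
        Objects.requireNonNull(toTest, "Parameter toTest can not be null.");
        return "parsed";
    }

    // Guarded caller: look the entry up first and report a finding
    // instead of handing a null stream to the parser.
    static String checkEntry(Map<String, byte[]> zipEntries, String path) {
        byte[] bytes = zipEntries.get(path); // null when the package lacks the entry
        if (bytes == null) {
            return "skipped: " + path + " has no corresponding zip entry";
        }
        return parse(new ByteArrayInputStream(bytes));
    }
}
```

With a map that contains only content.xml, checkEntry returns a "skipped" finding for styles.xml, while calling parse(null) directly throws the same exception message seen in the trace.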

AT031.ods also has MAN-4 (and several other errors), and it also crashes: AT031.ods

APP-1: [INFO] Validating C:\Users\maria\Desktop\014test\AT031.ods.
APP-4: [INFO] Validation report for C:\Users\maria\Desktop\014test\AT031.ods.
DOC-2: [INFO] package OpenDocument version 1.3 detected.
PKG-5: [ERROR] META-INF\ny.xml An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/ny.xml does not meet this criteria.
MIM-4: [ERROR] mimetype An OpenDocument Package SHALL contain a mimetype file IF the manifest contains a element whose manifest:full-path attribute has the value "/".
MAN-6: [INFO] META-INF\manifest.xml The OpenDocument Package manifest NEED NOT contain entries for file paths starting with META-INF/, META-INF/manifest.
MAN-6: [INFO] META-INF\manifest.xml The OpenDocument Package manifest NEED NOT contain entries for file paths starting with META-INF/, META-INF/signatures.
MAN-4: [ERROR] META-INF\manifest.xml The manifest SHALL contain an entry for every file in the package, manifest file entry Thumbnails/thumbnail.png has no corresponding zip entry.
MAN-3: [ERROR] META-INF\manifest.xml An OpenDocument Package manifest SHALL NOT contain a file entry the mimetype file.
MAN-1: [ERROR] extra.xml The manifest SHALL contain an entry for every file in the package, zip entry extra.xml has no corresponding manifest file entry.
PKG-7: [WARNING] Thumbnails\thumbnail.png An OpenDocument Package SHOULD contain a preview image Thumbnails/thumbnail.png.
PKG-2: [ERROR] styles.xml All files contained in the Zip file shall be non compressed (STORED) or compressed using the "deflate" (DEFLATED) algorithm. Zip entry styles.xml is compressed with an unknown algorithm.
NOT VALID, 6 errors, 1 warnings and 3 info messages.

java.lang.NullPointerException: Parameter toTest can not be null.
    at java.base/java.util.Objects.requireNonNull(Objects.java:259)
    at org.openpreservation.format.xml.XmlParser.parse(XmlParser.java:104)
    at org.openpreservation.odf.validation.rules.MacroRule.checkOdfScriptXml(MacroRule.java:63)
    at org.openpreservation.odf.validation.rules.MacroRule.check(MacroRule.java:46)
    at org.openpreservation.odf.validation.rules.ProfileImpl.getRulesetMessages(ProfileImpl.java:62)
    at org.openpreservation.odf.validation.rules.ProfileImpl.check(ProfileImpl.java:52)
    at org.openpreservation.odf.apps.CliValidator.profileReport(CliValidator.java:87)
    at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:58)
    at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:34)
    at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
    at picocli.CommandLine.access$1500(CommandLine.java:148)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
    at picocli.CommandLine.execute(CommandLine.java:2170)
    at org.openpreservation.odf.apps.CliValidator.main(CliValidator.java:98)

Interestingly, AT029.ods also has a MAN-4 error (and several others), but it does not crash.

APP-1: [INFO] Validating C:\Users\maria\Desktop\014test\AT029.ods.
APP-4: [INFO] Validation report for C:\Users\maria\Desktop\014test\AT029.ods.
PKG-5: [ERROR] META-INF\schema An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/schema does not meet this criteria.
XML-4: [ERROR] META-INF\schema Not a valid XML document. Validation exception at line 32 and column 55: attribute "style:hopla" not allowed here; expected attribute "style:font-adornments", "style:font-charset", "style:font-family-generic", "style:font-pitch", "style:name", "svg:accent-height", "svg:alphabetic", "svg:ascent", "svg:bbox", "svg:cap-height", "svg:descent", "svg:font-family", "svg:font-size", "svg:font-stretch", "svg:font-style", "svg:font-variant", "svg:font-weight", "svg:hanging", "svg:ideographic", "svg:mathematical", "svg:overline-position", "svg:overline-thickness", "svg:panose-1", "svg:slope", "svg:stemh", "svg:stemv", "svg:strikethrough-position", "svg:strikethrough-thickness", "svg:underline-position", "svg:underline-thickness", "svg:unicode-range", "svg:units-per-em", "svg:v-alphabetic", "svg:v-hanging", "svg:v-ideographic", "svg:v-mathematical", "svg:widths" or "svg:x-height".
XML-4: [ERROR] META-INF\schema Not a valid XML document. Validation exception at line 32 and column 55: element "style:font-face" missing required attribute "style:name".
DOC-2: [INFO] package OpenDocument version 1.3 detected.
PKG-5: [ERROR] META-INF\ny.xml An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/ny.xml does not meet this criteria.
PKG-5: [ERROR] META-INF\parse An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/parse does not meet this criteria.
XML-3: [ERROR] META-INF\parse Not a well formed XML document. XML parsing exception at line 1 and column 1: Content is not allowed in prolog..
MIM-1: [ERROR] mimetype The "mimetype" file SHALL be the first file of the zip file.
MIM-2: [ERROR] mimetype The "mimetype" file SHALL NOT be compressed.
MIM-3: [ERROR] mimetype The "mimetype" file SHALL NOT use an 'extra field' in its header.
DOC-3: [INFO] mimetype OpenDocument MIMETYPE application/vnd.oasis.opendocument.spreadsheet detected
MIM-5: [ERROR] META-INF\manifest.xml An OpenDocument Package mimetype file content SHALL be equal to the manifest:media-type attribute of the manifest element whose manifest:full-path attribute has the value "/".
MAN-6: [INFO] META-INF\manifest.xml The OpenDocument Package manifest NEED NOT contain entries for file paths starting with META-INF/, META-INF/signatures.
MAN-4: [ERROR] META-INF\manifest.xml The manifest SHALL contain an entry for every file in the package, manifest file entry Thumbnails/thumbnail.png has no corresponding zip entry.
MAN-3: [ERROR] META-INF\manifest.xml An OpenDocument Package manifest SHALL NOT contain a file entry the mimetype file.
MAN-2: [ERROR] META-INF\manifest.xml An OpenDocument Package manifest SHALL NOT contain a file entry for itself.
MAN-1: [ERROR] extra.xml The manifest SHALL contain an entry for every file in the package, zip entry extra.xml has no corresponding manifest file entry.
PKG-7: [WARNING] Thumbnails\thumbnail.png An OpenDocument Package SHOULD contain a preview image Thumbnails/thumbnail.png.
PKG-2: [ERROR] styles.xml All files contained in the Zip file shall be non compressed (STORED) or compressed using the "deflate" (DEFLATED) algorithm. Zip entry styles.xml is compressed with an unknown algorithm.
NOT VALID, 15 errors, 1 warnings and 3 info messages.

APP-5: [INFO] DNA ODF Spreadsheets Preservation Specification Profile report for C:\Users\maria\Desktop\014test\AT029.ods.
PKG-5: [ERROR] META-INF\schema An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/schema does not meet this criteria.
XML-4: [ERROR] META-INF\schema Not a valid XML document. Validation exception at line 32 and column 55: attribute "style:hopla" not allowed here; expected attribute "style:font-adornments", "style:font-charset", "style:font-family-generic", "style:font-pitch", "style:name", "svg:accent-height", "svg:alphabetic", "svg:ascent", "svg:bbox", "svg:cap-height", "svg:descent", "svg:font-family", "svg:font-size", "svg:font-stretch", "svg:font-style", "svg:font-variant", "svg:font-weight", "svg:hanging", "svg:ideographic", "svg:mathematical", "svg:overline-position", "svg:overline-thickness", "svg:panose-1", "svg:slope", "svg:stemh", "svg:stemv", "svg:strikethrough-position", "svg:strikethrough-thickness", "svg:underline-position", "svg:underline-thickness", "svg:unicode-range", "svg:units-per-em", "svg:v-alphabetic", "svg:v-hanging", "svg:v-ideographic", "svg:v-mathematical", "svg:widths" or "svg:x-height".
XML-4: [ERROR] META-INF\schema Not a valid XML document. Validation exception at line 32 and column 55: element "style:font-face" missing required attribute "style:name".
DOC-2: [INFO] package OpenDocument version 1.3 detected.
PKG-5: [ERROR] META-INF\ny.xml An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/ny.xml does not meet this criteria.
PKG-5: [ERROR] META-INF\parse An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/parse does not meet this criteria.
XML-3: [ERROR] META-INF\parse Not a well formed XML document. XML parsing exception at line 1 and column 1: Content is not allowed in prolog..
MIM-1: [ERROR] mimetype The "mimetype" file SHALL be the first file of the zip file.
MIM-2: [ERROR] mimetype The "mimetype" file SHALL NOT be compressed.
MIM-3: [ERROR] mimetype The "mimetype" file SHALL NOT use an 'extra field' in its header.
DOC-3: [INFO] mimetype OpenDocument MIMETYPE application/vnd.oasis.opendocument.spreadsheet detected
MIM-5: [ERROR] META-INF\manifest.xml An OpenDocument Package mimetype file content SHALL be equal to the manifest:media-type attribute of the manifest element whose manifest:full-path attribute has the value "/".
MAN-6: [INFO] META-INF\manifest.xml The OpenDocument Package manifest NEED NOT contain entries for file paths starting with META-INF/, META-INF/signatures.
MAN-4: [ERROR] META-INF\manifest.xml The manifest SHALL contain an entry for every file in the package, manifest file entry Thumbnails/thumbnail.png has no corresponding zip entry.
MAN-3: [ERROR] META-INF\manifest.xml An OpenDocument Package manifest SHALL NOT contain a file entry the mimetype file.
MAN-2: [ERROR] META-INF\manifest.xml An OpenDocument Package manifest SHALL NOT contain a file entry for itself.
MAN-1: [ERROR] extra.xml The manifest SHALL contain an entry for every file in the package, zip entry extra.xml has no corresponding manifest file entry.
PKG-7: [WARNING] Thumbnails\thumbnail.png An OpenDocument Package SHOULD contain a preview image Thumbnails/thumbnail.png.
PKG-2: [ERROR] styles.xml All files contained in the Zip file shall be non compressed (STORED) or compressed using the "deflate" (DEFLATED) algorithm. Zip entry styles.xml is compressed with an unknown algorithm.
POL_9: [ERROR] META-INF\signatures Digital Signatures | The package MUST NOT contain any digital signatures.
POL_2: [ERROR] AT029.ods Standard Compliance | Package does not comply with specification. The file MUST comply with the standard "OASIS Open Document Format for Office Applications (OpenDocument) v1.3".
POL_5: [INFO] content.xml External data check | Table formula detected.
NOT VALID, 17 errors, 1 warnings and 4 info messages.

maria-messerschmidt commented 4 weeks ago

AT031.ods now validates without crashing, but AT012.ods still crashes (although at an earlier stage, cf. #211):

APP-1: [INFO] Validating D:\odsfiler\AT031.ods.
APP-4: [INFO] Validation report for D:\odsfiler\AT031.ods.
DOC-2: [INFO] package OpenDocument version 1.3 detected.
PKG-5: [ERROR] META-INF\ny.xml An OpenDocument Package SHALL only contain the "META-INF/manifest.xml" and files containg the term "signatures" in their name in the "META-INF" folder. File META-INF/ny.xml does not meet this criteria.
MIM-4: [ERROR] mimetype An OpenDocument Package SHALL contain a mimetype file IF the manifest contains a element whose manifest:full-path attribute has the value "/".
MAN-6: [INFO] META-INF\manifest.xml The OpenDocument Package manifest NEED NOT contain entries for file paths starting with META-INF/, META-INF/manifest.
MAN-6: [INFO] META-INF\manifest.xml The OpenDocument Package manifest NEED NOT contain entries for file paths starting with META-INF/, META-INF/signatures.
MAN-4: [ERROR] META-INF\manifest.xml The manifest SHALL contain an entry for every file in the package, manifest file entry Thumbnails/thumbnail.png has no corresponding zip entry.
MAN-3: [ERROR] META-INF\manifest.xml An OpenDocument Package manifest SHALL NOT contain a file entry the mimetype file.
MAN-1: [ERROR] extra.xml The manifest SHALL contain an entry for every file in the package, zip entry extra.xml has no corresponding manifest file entry.
PKG-7: [WARNING] Thumbnails\thumbnail.png An OpenDocument Package SHOULD contain a preview image Thumbnails/thumbnail.png.
PKG-2: [ERROR] styles.xml All files contained in the Zip file shall be non compressed (STORED) or compressed using the "deflate" (DEFLATED) algorithm. Zip entry styles.xml is compressed with an unknown algorithm.
NOT VALID, 6 errors, 1 warnings and 3 info messages.

APP-5: [INFO] DNA ODF Spreadsheets Preservation Specification Profile report for D:\odsfiler\AT031.ods.
POL_2: [ERROR] AT031.ods Standard Compliance | Package does not comply with specification. The file MUST comply with the standard "OASIS Open Document Format for Office Applications (OpenDocument) v1.3".
POL_9: [ERROR] META-INF\signatures Digital Signatures | The package MUST NOT contain any digital signatures.
POL_3: [ERROR] mimetype Package mimetype entry | An ODF package MUST have a mimetype entry as specified in the Section 3.3 of the ODF specification v1.3.
POL_4: [ERROR] mimetype Extension and MIME type | The MIME type value MUST be: "application/vnd.oasis.opendocument.spreadsheet" and the file extension MUST be ".ods".
POL_5: [INFO] content.xml External data check | Table formula detected.
NOT VALID, 10 errors, 1 warnings and 4 info messages.

maria-messerschmidt commented 4 weeks ago

AT069.ods is also crashing at this early stage with APP-1. The scenario for this file is that there are several discrepancies in the manifest: it lists files not in the package and fails to list some files that are in the package. AT069.ods

APP-1: [INFO] Validating D:\odsfiler\AT069.ods.
java.lang.NullPointerException: Cannot invoke "org.openpreservation.odf.xml.OdfXmlDocument.getParseResult()" because the return value of "org.openpreservation.odf.pkg.OdfPackageDocument.getXmlDocument(String)" is null
    at org.openpreservation.odf.pkg.OdfPackageImpl.getEntryXmlParseResult(OdfPackageImpl.java:200)
    at org.openpreservation.odf.validation.ValidatingParserImpl.validateXmlEntry(ValidatingParserImpl.java:122)
    at org.openpreservation.odf.validation.ValidatingParserImpl.validateOdfXmlEntries(ValidatingParserImpl.java:109)
    at org.openpreservation.odf.validation.ValidatingParserImpl.validate(ValidatingParserImpl.java:99)
    at org.openpreservation.odf.validation.ValidatingParserImpl.validatePackage(ValidatingParserImpl.java:72)
    at org.openpreservation.odf.validation.Validator.validatePackage(Validator.java:109)
    at org.openpreservation.odf.validation.Validator.validate(Validator.java:88)
    at org.openpreservation.odf.apps.CliValidator.validatePath(CliValidator.java:74)
    at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:54)
    at org.openpreservation.odf.apps.CliValidator.call(CliValidator.java:34)
    at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
    at picocli.CommandLine.access$1500(CommandLine.java:148)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
    at picocli.CommandLine.execute(CommandLine.java:2170)
    at org.openpreservation.odf.apps.CliValidator.main(CliValidator.java:98)
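
The AT069.ods scenario (a manifest that lists files missing from the package and omits files that are present) can be expressed as two set differences. The following is a rough sketch under stated assumptions, not the validator's implementation: ManifestDiffSketch and both method names are hypothetical, the inputs are plain sets of path strings, and the exemptions mentioned in the reports above (for example the MAN-6 note about META-INF/ paths) are deliberately ignored. Paths in the first result are the ones for which a lookup by name would find nothing. That may be related to the null in the trace, but the thread does not confirm it.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not code from odf-validator.
final class ManifestDiffSketch {

    // Paths the manifest lists that have no zip entry (the MAN-4 direction).
    static Set<String> listedButMissing(Set<String> manifestPaths, Set<String> zipNames) {
        Set<String> diff = new TreeSet<>(manifestPaths); // copy so the inputs stay untouched
        diff.removeAll(zipNames);
        return diff;
    }

    // Zip entries that the manifest does not list (the MAN-1 direction).
    static Set<String> presentButUnlisted(Set<String> manifestPaths, Set<String> zipNames) {
        Set<String> diff = new TreeSet<>(zipNames);
        diff.removeAll(manifestPaths);
        return diff;
    }
}
```

For a manifest listing content.xml, styles.xml and Thumbnails/thumbnail.png against a zip that holds content.xml and extra.xml, the first method yields styles.xml and Thumbnails/thumbnail.png, and the second yields extra.xml.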