E-ARK-Software / eark-validator

E-ARK Python Information Package validation library
Apache License 2.0

Incorrect validation of requirement CSIP32 #109

Closed: dockmd closed this issue 2 weeks ago

dockmd commented 2 weeks ago

Test case: https://github.com/DILCISBoard/eark-ip-test-corpus/tree/integration/corpus/CSIP/CSIP32/testCase.xml defines 1 package that should be invalid, but the validator reports it as valid.

Valid according to the validator, but should be invalid:

Package: https://github.com/DILCISBoard/eark-ip-test-corpus/tree/integration/corpus/CSIP/CSIP32/valid/IP_18000_CSIP32_2

Output: struct result is: WellFormed

{"uid":"fea1742940ce48e1855e381b9d29d524","structure":{"status":"WellFormed","messages":[{"rule_id":"CSIPSTR3","severity":"Info","location":"root IP_18000_CSIP32_2","message":"The Information Package MAY be contained in an archive/compressed form, e.g. TAR or ZIP, for storage or transfer. The specific format details should be decided by the interested parties and documented, for example in a submission agreement or statement of access terms."},{"rule_id":"CSIPSTR6","severity":"Warn","location":"root IP_18000_CSIP32_2","message":"If preservation metadata are available they SHOULD be included in sub-folder preservation."},{"rule_id":"CSIPSTR7","severity":"Warn","location":"root IP_18000_CSIP32_2","message":"If descriptive metadata are available, they SHOULD be included in sub-folder descriptive."},{"rule_id":"CSIPSTR8","severity":"Info","location":"root IP_18000_CSIP32_2","message":"If any other metadata are available, they MAY be included in separate sub-folders, for example an additional folder named other."},{"rule_id":"CSIPSTR16","severity":"Warn","location":"root IP_18000_CSIP32_2","message":"We recommend including any supplementary documentation for the package or a specific representation within the package. Supplementary documentation SHOULD be placed in a sub-folder called documentation within the Information Package root folder and/or the representation folder. 
Examples of documentation include representation information and manuals for the system the data objects have been exported from."},{"rule_id":"CSIPSTR12","severity":"Warn","location":"rep1 representation","message":"The representation folder SHOULD include a metadata file named METS.xml which includes information about the identity and structure of the representation and its components. The recommended best practice is to always have a METS.xml in the representation folder."},{"rule_id":"CSIPSTR13","severity":"Warn","location":"rep1 representation","message":"The representation folder SHOULD include a sub-folder named metadata which MAY include all metadata about the specific representation."}]},"metadata":{"schema_results":{"status":"VALID","messages":[]},"schematron_results":{"status":"INVALID","messages":[{"rule_id":"CSIP17","severity":"Warn","location":"/mets:metsmets:dmdSec/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']","message":"Must be used if descriptive metadata about the package content is available. NOTE: According to official METS documentation each metadata section must describe one and only one set of metadata. 
As such, if implementers want to include multiple occurrences of descriptive metadata into the package this must be done by repeating the whole dmdSec element for each individual metadata."},{"rule_id":"CSIP8","severity":"Warn","location":"/mets:mets/mets:metsHdr@LASTMODDATE/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='metsHdr' and namespace-uri()='http://www.loc.gov/METS/']","message":"The metsHdr element SHOULD have a LASTMODDATE attribute."},{"rule_id":"CSIP34","severity":"Warn","location":"/mets:mets/mets:amdSec/mets:digiprovMD@STATUS/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='digiprovMD' and namespace-uri()='http://www.loc.gov/METS/']","message":"SHOULD be used to indicated the status of the package."},{"rule_id":"CSIP35","severity":"Warn","location":"/mets:mets/mets:amdSec/mets:digiprovMDmets:mdRef/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='amdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='digiprovMD' and namespace-uri()='http://www.loc.gov/METS/']","message":"Should provide a reference to the digital provenance metadata file stored in the “metadata” section of the IP."},{"rule_id":"CSIP60","severity":"Error","location":"/mets:mets/mets:fileSecmets:fileGrp[@USE = 'Documentation']/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='fileSec' and namespace-uri()='http://www.loc.gov/METS/']","message":"All documentation pertaining to the transferred content is placed in one or more file group elements with mets/fileSec/fileGrp/@USE attribute value “Documentation”."},{"rule_id":"CSIP114","severity":"Error","location":"/mets:mets/mets:fileSecmets:fileGrp[starts-with(@USE, 'Representations')]/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='fileSec' and 
namespace-uri()='http://www.loc.gov/METS/']","message":"A pointer to the METS document describing the representation or pointers to the content being transferred must be present in one or more file groups with mets/fileSec/fileGrp/@USE attribute value starting with \"Representations\" followed by the path to the folder where the representation level METS document is placed. For example \"Representation/submission\" and \"Representation/ingest\"."},{"rule_id":"CSIP63","severity":"Error","location":"/mets:mets/mets:fileSec/mets:fileGrp(@csip:CONTENTINFORMATIONTYPE = 'OTHER' and @csip:OTHERCONTENTINFORMATIONTYPE) or (@csip:CONTENTINFORMATIONTYPE != 'OTHER' and not(@csip:CONTENTINFORMATIONTYPE))/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='fileSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='fileGrp' and namespace-uri()='http://www.loc.gov/METS/']","message":"When the mets/fileSec/fileGrp/@csip:CONTENTINFORMATIONTYPE attribute has the value \"OTHER\" the mets/fileSec/fileGrp/@csip:OTHERCONTENTINFORMATIONTYPE must state a value for the Content Information Type Specification used."},{"rule_id":"CSIP82","severity":"Error","location":"/mets:mets/mets:structMap@LABEL = 'CSIP'/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='structMap' and namespace-uri()='http://www.loc.gov/METS/']","message":"The mets/structMap/@LABEL attribute value is set to “CSIP” from the vocabulary."},{"rule_id":"SIP2","severity":"Error","location":"/mets:mets@PROFILE = 'https://earksip.dilcis.eu/profile/E-ARK-SIP.xml'/*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']","message":"The PROFILE attribute MUST contain the URL of the METS profile, for a SIP: https://earksip.dilcis.eu/profile/E-ARK-SIP.xml."},{"rule_id":"SIP14","severity":"Error","location":"/mets:mets/mets:metsHdr/mets:agent[@ROLE = 'CREATOR']/mets:note@NOTETYPE = 'IDENTIFICATIONCODE'/[local-name()='mets' and 
namespace-uri()='http://www.loc.gov/METS/']/[local-name()='metsHdr' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='agent' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='note' and namespace-uri()='http://www.loc.gov/METS/']","message":"The creator agent element MUST have a NOTETYPE attribute of value IDENTIFICATIONCODE."}]}},"package":{"mets":{"root":{"namespaces":{"xsi":"http://www.w3.org/2001/XMLSchema-instance","":"http://www.loc.gov/METS/","xlink":"http://www.w3.org/1999/xlink","csip":"https://DILCIS.eu/XML/METS/CSIPExtensionMETS"},"objid":"IP_18000_CSIP32_2","label":"","type":"Databases","profile":"http://www.eark-project.com/METS/IP.xml"},"file_entries":[{"path":"schemas/mets.xsd","type":"file","size":"133920","checksum":{"algorithm":"MD5","value":"4E9961DEC3DE72081E6142B28A437FB8"},"mimetype":"application/xml","isValid":true,"errors":[]},{"path":"schemas/XMLSchema.xsd","type":"file","size":"87677","checksum":{"algorithm":"MD5","value":"94ED1A93CE3147D01BCB2FC1126255ED"},"mimetype":"application/xml","isValid":true,"errors":[]},{"path":"schemas/xlink.xsd","type":"file","size":"8052","checksum":{"algorithm":"MD5","value":"14DAC48802F5F99C51A6B200F9A0B3B4"},"mimetype":"application/xml","isValid":true,"errors":[]},{"path":"schemas/CSIPExtensionMETS.xsd","type":"file","size":"1673","checksum":{"algorithm":"MD5","value":"1A31B3AA3AE1E9B99E7A8B4618F3B485"},"mimetype":"application/xml","isValid":true,"errors":[]}]},"details":{"name":"IP_18000_CSIP32_2","label":"","oaispackagetype":"SIP","othertype":"","contentinformationtype":"SIARD2","checksums":[]},"representations":[]}}
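
The report above contains no message for CSIP32, which is why the package is not flagged for that requirement. A minimal Python sketch of how one might check a report for a given rule follows. It assumes the report has the JSON shape shown in the output above. The sample is trimmed from that output (locations and messages are blanked), and `rule_ids` and `rule_was_reported` are hypothetical helper names, not part of eark-validator.

```python
import json

# Trimmed-down report with the same shape as the output above.
# Only a few of the schematron messages are kept; text fields are blanked.
REPORT_JSON = """
{
  "structure": {"status": "WellFormed", "messages": []},
  "metadata": {
    "schema_results": {"status": "VALID", "messages": []},
    "schematron_results": {
      "status": "INVALID",
      "messages": [
        {"rule_id": "CSIP8", "severity": "Warn", "location": "", "message": ""},
        {"rule_id": "CSIP60", "severity": "Error", "location": "", "message": ""},
        {"rule_id": "SIP2", "severity": "Error", "location": "", "message": ""}
      ]
    }
  }
}
"""


def rule_ids(report, severity=None):
    """Return schematron rule IDs in report order, optionally filtered by severity."""
    messages = report["metadata"]["schematron_results"]["messages"]
    return [
        m["rule_id"]
        for m in messages
        if severity is None or m["severity"] == severity
    ]


def rule_was_reported(report, rule_id):
    """Return True if the schematron results contain a message for rule_id."""
    return rule_id in rule_ids(report)


report = json.loads(REPORT_JSON)
errors = rule_ids(report, severity="Error")
print("Rules reported as errors:", errors)
print("CSIP32 reported:", rule_was_reported(report, "CSIP32"))
```

Running the same check against the full report should also show CSIP32 missing from the schematron messages, even though the test case expects a violation of that requirement.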