E-ARK-Software / eark-validator

E-ARK Python Information Package validation library
Apache License 2.0

Incorrect validation of requirement CSIP11 #78

Open dockmd opened 3 weeks ago

dockmd commented 3 weeks ago

Test case: https://github.com/DILCISBoard/eark-ip-test-corpus/tree/integration/corpus/CSIP/CSIP11/testCase.xml defines 1 package that should be invalid, but the validator reports it as valid.

Valid according to the validator, but should be invalid:

Package: https://github.com/DILCISBoard/eark-ip-test-corpus/tree/integration/corpus/CSIP/CSIP11/invalid/mets-xml_metsHdr_agent_all_criterias_different_objs

Output:

struct result is: WellFormed
{"uid":"5a4722bb17d843128d6b34ee80ffd004","structure":{"status":"WellFormed","messages":[{"rule_id":"CSIPSTR3","severity":"Info","location":"root mets-xml_metsHdr_agent_all_criterias_different_objs","message":"The Information Package MAY be contained in an archive/compressed form, e.g. TAR or ZIP, for storage or transfer. The specific format details should be decided by the interested parties and documented, for example in a submission agreement or statement of access terms."},{"rule_id":"CSIPSTR5","severity":"Warn","location":"root mets-xml_metsHdr_agent_all_criterias_different_objs","message":"The Information Package root folder SHOULD include a folder named metadata, which SHOULD include metadata relevant to the whole package."},{"rule_id":"CSIPSTR12","severity":"Warn","location":"rep1 representation","message":"The representation folder SHOULD include a metadata file named METS.xml which includes information about the identity and structure of the representation and its components. 
The recommended best practice is to always have a METS.xml in the representation folder."},{"rule_id":"CSIPSTR13","severity":"Warn","location":"rep1 representation","message":"The representation folder SHOULD include a sub-folder named metadata which MAY include all metadata about the specific representation."}]},"metadata":{"schema_results":{"status":"VALID","messages":[]},"schematron_results":{"status":"INVALID","messages":[{"rule_id":"CSIP4","severity":"Error","location":"/mets:mets((@csip:CONTENTINFORMATIONTYPE = 'ERMS') or (@csip:CONTENTINFORMATIONTYPE = 'SIARD1') or (@csip:CONTENTINFORMATIONTYPE = 'SIARD2') or (@csip:CONTENTINFORMATIONTYPE = 'SIARDDK') or (@csip:CONTENTINFORMATIONTYPE = 'GeoData') or (@csip:CONTENTINFORMATIONTYPE = 'citscarchival_v1_0') or (@csip:CONTENTINFORMATIONTYPE = 'cscarchival_v1_0') or (@csip:CONTENTINFORMATIONTYPE = 'citserms_v2_1') or (@csip:CONTENTINFORMATIONTYPE = 'citserms_v3_0') or (@csip:CONTENTINFORMATIONTYPE = 'citspremis_v1_0') or (@csip:CONTENTINFORMATIONTYPE = 'cspremis_v1_0') or (@csip:CONTENTINFORMATIONTYPE = 'citsehpj_v1_0') or (@csip:CONTENTINFORMATIONTYPE = 'citsehpj_v2_0') or (@csip:CONTENTINFORMATIONTYPE = 'citsehcr_v1_0') or (@csip:CONTENTINFORMATIONTYPE = 'citssiard_v1_0') or (@csip:CONTENTINFORMATIONTYPE = 'citsgeospatial_v3_0') or (@csip:CONTENTINFORMATIONTYPE = 'MIXED') or (@csip:CONTENTINFORMATIONTYPE = 'OTHER')) and (@csip:CONTENTINFORMATIONTYPE != 'OTHER' or (@csip:CONTENTINFORMATIONTYPE = 'OTHER' and @csip:OTHERCONTENTINFORMATIONTYPE != ''))/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']","message":"Used to declare the Content Information Type Specification used when creating the package. Legal values are defined in a fixed vocabulary. 
The attribute is mandatory for representation level METS documents."},{"rule_id":"CSIP5","severity":"Error","location":"/mets:mets(@csip:CONTENTINFORMATIONTYPE = 'OTHER' and @csip:OTHERCONTENTINFORMATIONTYPE) or @csip:CONTENTINFORMATIONTYPE != 'OTHER'/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']","message":"When the mets/@csip:CONTENTINFORMATIONTYPE has the value “OTHER” the mets/@csip:OTHERCONTENTINFORMATIONTYPE must state the content information type."},{"rule_id":"CSIP17","severity":"Warn","location":"/mets:metsmets:dmdSec/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']","message":"Must be used if descriptive metadata about the package content is available. NOTE: According to official METS documentation each metadata section must describe one and only one set of metadata. As such, if implementers want to include multiple occurrences of descriptive metadata into the package this must be done by repeating the whole dmdSec element for each individual metadata."},{"rule_id":"CSIP31","severity":"Warn","location":"/mets:metsmets:amdSec/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']","message":"If administrative / preservation metadata is available, it must be described using the administrative metadata section (amdSec) element. 
All administrative metadata is present in a single amdSec element."},{"rule_id":"CSIP8","severity":"Warn","location":"/mets:mets/mets:metsHdr@LASTMODDATE/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='metsHdr' and namespace-uri()='http://www.loc.gov/METS/']","message":"The metsHdr element SHOULD have a LASTMODDATE attribute."},{"rule_id":"CSIP12","severity":"Error","location":"/mets:mets/mets:metsHdr/mets:agent[@ROLE = 'CREATOR']@TYPE = 'OTHER'/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='metsHdr' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='agent' and namespace-uri()='http://www.loc.gov/METS/'][1]","message":"The agent element MUST have a TYPE attribute with the value \"OTHER\"."},{"rule_id":"CSIP63","severity":"Error","location":"/mets:mets/mets:fileSec/mets:fileGrp(@csip:CONTENTINFORMATIONTYPE = 'OTHER' and @csip:OTHERCONTENTINFORMATIONTYPE) or (@csip:CONTENTINFORMATIONTYPE != 'OTHER' and not(@csip:CONTENTINFORMATIONTYPE))/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='fileSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='fileGrp' and namespace-uri()='http://www.loc.gov/METS/'][1]","message":"When the mets/fileSec/fileGrp/@csip:CONTENTINFORMATIONTYPE attribute has the value \"OTHER\" the mets/fileSec/fileGrp/@csip:OTHERCONTENTINFORMATIONTYPE must state a value for the Content Information Type Specification used."},{"rule_id":"CSIP63","severity":"Error","location":"/mets:mets/mets:fileSec/mets:fileGrp(@csip:CONTENTINFORMATIONTYPE = 'OTHER' and @csip:OTHERCONTENTINFORMATIONTYPE) or (@csip:CONTENTINFORMATIONTYPE != 'OTHER' and not(@csip:CONTENTINFORMATIONTYPE))/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='fileSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='fileGrp' and namespace-uri()='http://www.loc.gov/METS/'][2]","message":"When the 
mets/fileSec/fileGrp/@csip:CONTENTINFORMATIONTYPE attribute has the value \"OTHER\" the mets/fileSec/fileGrp/@csip:OTHERCONTENTINFORMATIONTYPE must state a value for the Content Information Type Specification used."},{"rule_id":"CSIP63","severity":"Error","location":"/mets:mets/mets:fileSec/mets:fileGrp(@csip:CONTENTINFORMATIONTYPE = 'OTHER' and @csip:OTHERCONTENTINFORMATIONTYPE) or (@csip:CONTENTINFORMATIONTYPE != 'OTHER' and not(@csip:CONTENTINFORMATIONTYPE))/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='fileSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='fileGrp' and namespace-uri()='http://www.loc.gov/METS/'][3]","message":"When the mets/fileSec/fileGrp/@csip:CONTENTINFORMATIONTYPE attribute has the value \"OTHER\" the mets/fileSec/fileGrp/@csip:OTHERCONTENTINFORMATIONTYPE must state a value for the Content Information Type Specification used."},{"rule_id":"CSIP105","severity":"Warn","location":"/mets:mets/mets:structMap[@LABEL = 'CSIP']/mets:divmets:div[@LABEL = 'Representations']/mets:div/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='structMap' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='div' and namespace-uri()='http://www.loc.gov/METS/']","message":"When a package consists of multiple representations, each described by a representation level METS.xml document, there should be a discrete representation div element for each representation."},{"rule_id":"CSIP91","severity":"Warn","location":"/mets:mets/mets:structMap[@LABEL = 'CSIP']/mets:div/mets:div[@LABEL = 'Metadata']@ADMID/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='structMap' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='div' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='div' and namespace-uri()='http://www.loc.gov/METS/'][1]","message":"The admimistrative metadata division should reference all current 
administrative metadata sections."},{"rule_id":"CSIP92","severity":"Warn","location":"/mets:mets/mets:structMap[@LABEL = 'CSIP']/mets:div/mets:div[@LABEL = 'Metadata']@DMDID/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='structMap' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='div' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='div' and namespace-uri()='http://www.loc.gov/METS/'][1]","message":"The descriptive metadata division should reference all current descriptive metadata sections."},{"rule_id":"SIP2","severity":"Error","location":"/mets:mets@PROFILE = 'https://earksip.dilcis.eu/profile/E-ARK-SIP.xml'/*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']","message":"The PROFILE attribute MUST contain the URL of the METS profile, for a SIP: https://earksip.dilcis.eu/profile/E-ARK-SIP.xml."},{"rule_id":"SIP14","severity":"Error","location":"/mets:mets/mets:metsHdr/mets:agent[@ROLE = 'CREATOR']/mets:note@NOTETYPE = 'IDENTIFICATIONCODE'/[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/[local-name()='metsHdr' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='agent' and namespace-uri()='http://www.loc.gov/METS/'][1]/*[local-name()='note' and namespace-uri()='http://www.loc.gov/METS/']","message":"The creator agent element MUST have a NOTETYPE attribute of value 
IDENTIFICATIONCODE."}]}},"package":{"mets":{"root":{"namespaces":{"":"http://www.loc.gov/METS/","csip":"https://DILCIS.eu/XML/METS/CSIPExtensionMETS","xsi":"http://www.w3.org/2001/XMLSchema-instance","xlink":"http://www.w3.org/1999/xlink"},"objid":"mets-xml_metsHdr_agent_all_criterias_different_objs","label":"","type":"Mixed","profile":"https://earkcsip.dilcis.eu/profile/E-ARK-CSIP.xml"},"file_entries":[{"path":"documentation/Doc1.txt","type":"file","size":"40","checksum":{"algorithm":"MD5","value":"F57DBBDDF87F18043C2029D978749318"},"mimetype":"text/plain","isValid":true,"errors":[]},{"path":"schemas/DILCISExtensionMETS.xsd","type":"file","size":"1633","checksum":{"algorithm":"MD5","value":"E99C19B9CA1271C1D9BAFED19C4BD50A"},"mimetype":"application/xml","isValid":true,"errors":[]},{"path":"schemas/METS.xsd","type":"file","size":"136472","checksum":{"algorithm":"MD5","value":"D303B7A71BA2B4FF0061BDCBA0F152E0"},"mimetype":"application/xml","isValid":true,"errors":[]},{"path":"schemas/xlink.xsd","type":"file","size":"3180","checksum":{"algorithm":"MD5","value":"6BDC7F9459A502964F889D70A335CECE"},"mimetype":"application/xml","isValid":true,"errors":[]},{"path":"representations/rep1/data/plain_text_document.txt","type":"file","size":"12","checksum":{"algorithm":"MD5","value":"A9308BDE501CFD1D91CE4E5E861C8971"},"mimetype":"text/plain","isValid":true,"errors":[]}]},"details":{"name":"mets-xml_metsHdr_agent_all_criterias_different_objs","label":"","oaispackagetype":"SIP","othertype":"","contentinformationtype":"","checksums":[]},"representations":[]}}
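
For anyone triaging this: the schematron results above are marked INVALID overall, but none of the listed messages has the rule id CSIP11, and that missing message is what this issue is about. A minimal sketch of how one might check a saved report for a given rule id, assuming the report has the same shape as the output above. The `rule_ids` helper and the trimmed `sample` report are hypothetical and use only the standard `json` module, not any eark-validator API.

```python
import json


def rule_ids(report):
    """Collect the rule ids reported by the schematron stage of a report dict.

    Assumes the layout seen in the output above:
    report["metadata"]["schematron_results"]["messages"][i]["rule_id"].
    """
    messages = report["metadata"]["schematron_results"]["messages"]
    return {message["rule_id"] for message in messages}


# Hypothetical, trimmed-down stand-in for the report above:
# only a few of the original messages are kept, with most fields dropped.
sample = json.loads("""
{"metadata": {"schematron_results": {"status": "INVALID", "messages": [
  {"rule_id": "CSIP4", "severity": "Error"},
  {"rule_id": "CSIP12", "severity": "Error"},
  {"rule_id": "SIP14", "severity": "Error"}
]}}}
""")

fired = rule_ids(sample)
# CSIP11 is expected to fire for this test package but is absent.
print("CSIP11 reported:", "CSIP11" in fired)
```

Running the same helper on the full JSON saved to a file (loaded with `json.load`) should make it easy to compare the reported rule ids against what each corpus test case expects.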