E-ARK-Software / eark-validator

E-ARK Python Information Package validation library
Apache License 2.0

Validating CSIP24 #52

Closed: dockmd closed this issue 2 months ago

dockmd commented 3 months ago

The validator crashes while testing rule 1 (package "IP_18000_CSIP24_1") from testCase.xml.

Sunday-Crunk commented 2 months ago

Perfect, nice one

CSIP24: the validator crashes while testing rule 1 (package "IP_18000_CSIP24_1") from testCase.xml.

Test

eark-validator /eark-test-corpus/eark-ip-test-corpus/corpus/CSIP/CSIP24/invalid/IP_18000_CSIP24_1

Result pre-patch

Fail: Validator crashed

Traceback (most recent call last):
  File "/eark-validator/venv/bin/eark-validator", line 8, in <module>
    sys.exit(main())
  File "eark-validator/eark_validator/cli/app.py", line 127, in main
    _loop_exit, _ = _validate_ip(file_arg, args.specification_version)
  File "/eark-validator/eark_validator/cli/app.py", line 135, in _validate_ip
    report = PACKAGES.PackageValidator(checked_path, version).validation_report
  File "/eark-validator/eark_validator/packages.py", line 66, in __init__
    self._report = self.validate(self._version, self._to_proc)
  File "/eark-validator/eark_validator/packages.py", line 96, in validate
    validator.validate_mets(METS)
  File "/eark-validator/eark_validator/mets.py", line 156, in validate_mets
    self._process_element(element)
  File "/eark-validator/eark_validator/mets.py", line 179, in _process_element
    self._file_refs.append(_parse_file_entry(element))
  File "/eark-validator/eark_validator/mets.py", line 192, in _parse_file_entry
    'path': _path_from_xml_element(element),
  File "/eark-validator/eark_validator/mets.py", line 208, in _path_from_xml_element
    return  _get_path_attrib(loc_ele)
  File "/eark-validator/eark_validator/mets.py", line 214, in _get_path_attrib
    return element.attrib[attrib_name]
  File "src/lxml/etree.pyx", line 2502, in lxml.etree._Attrib.__getitem__
KeyError: '{http://www.w3.org/1999/xlink}href'
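
For readers following the traceback: the final frame indexes the attribute mapping directly, which raises KeyError when an element carries no xlink:href. The sketch below reproduces that failure mode and shows one defensive alternative. It is only an illustration under stated assumptions, not the change that was actually merged: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and the sample METS fragment and both helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
# Fully qualified attribute name, as it appears in the KeyError above.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical METS fragment: the mdRef deliberately has no xlink:href.
METS_SAMPLE = """
<mets xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <dmdSec ID="dmd-1">
    <mdRef LOCTYPE="URL" MDTYPE="OTHER"/>
  </dmdSec>
</mets>
"""


def get_path_attrib_strict(element):
    """Direct indexing: raises KeyError if the attribute is absent."""
    return element.attrib[XLINK_HREF]


def get_path_attrib_safe(element, default=None):
    """Hypothetical defensive variant: returns a default instead of raising."""
    return element.attrib.get(XLINK_HREF, default)


root = ET.fromstring(METS_SAMPLE)
md_ref = root.find(f"{{{METS_NS}}}dmdSec/{{{METS_NS}}}mdRef")

try:
    get_path_attrib_strict(md_ref)
    strict_outcome = "no error"
except KeyError as err:
    strict_outcome = f"KeyError: {err}"

safe_outcome = get_path_attrib_safe(md_ref)

print(strict_outcome)
print(safe_outcome)
```

With a lookup that tolerates the missing attribute, the parser can finish and leave it to the validation rules to report the problem rather than crashing first.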

Fix result

Pass: rule reported correctly.

[
    {
        "rule_id": "CSIP24",
        "severity": "Error",
        "location": {
            "context": "/mets:mets/mets:dmdSec/mets:mdRef",
            "test": "@xlink:href",
            "description": "/*[local-name()='mets' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='dmdSec' and namespace-uri()='http://www.loc.gov/METS/']/*[local-name()='mdRef' and namespace-uri()='http://www.loc.gov/METS/']"
        },
        "message": "The actual location of the resource. This specification recommends recording a URL type filepath in this attribute."
    }
]

Test

eark-validator /eark-test-corpus/eark-ip-test-corpus/corpus/CSIP/CSIP24/valid/IP_18000_CSIP24_2

Result pre-patch

Pass: No entry for CSIP24

Fix result

Pass: No entry for CSIP24
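
The pass criteria used in both tests (an entry for CSIP24 on the invalid package, none on the valid one) could be checked mechanically along the following lines. This is a rough sketch that assumes the report is available as a JSON list shaped like the one shown above; the helper name entries_for_rule is hypothetical, the sample is a trimmed copy of that report, and the empty list stands in for a report with no CSIP24 entry.

```python
import json

# Trimmed copy of the report shown above (the long "description" field is omitted).
INVALID_REPORT_JSON = """
[
    {
        "rule_id": "CSIP24",
        "severity": "Error",
        "location": {
            "context": "/mets:mets/mets:dmdSec/mets:mdRef",
            "test": "@xlink:href"
        },
        "message": "The actual location of the resource. This specification recommends recording a URL type filepath in this attribute."
    }
]
"""


def entries_for_rule(report, rule_id):
    """Return every report entry whose rule_id matches (hypothetical helper)."""
    return [entry for entry in report if entry.get("rule_id") == rule_id]


invalid_report = json.loads(INVALID_REPORT_JSON)
valid_report = []  # stand-in for a report that contains no CSIP24 entry

invalid_hits = entries_for_rule(invalid_report, "CSIP24")
valid_hits = entries_for_rule(valid_report, "CSIP24")

# Invalid package: the rule should be reported. Valid package: it should not.
print("invalid package reports CSIP24:", bool(invalid_hits))
print("valid package reports CSIP24:", bool(valid_hits))
```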

dockmd commented 2 months ago

Issue solved - https://github.com/E-ARK-Software/eark-validator/pull/48