artefactual-labs / mets-reader-writer

Library to parse and create METS files, especially for Archivematica.
https://mets-reader-writer.readthedocs.io
GNU Affero General Public License v3.0

Problem: Characterization tool namespaces in premis:objects prevent serialization to XML #94

Open tw4l opened 2 years ago

tw4l commented 2 years ago

Example code such as the following throws an exception:

for premis_object in fs_entry.get_premis_objects():
    premis_object_xml = premis_object.tostring()

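This appears to happen because the characterization tools' namespaces are missing from the namespace map, so lxml is handed a tag name made of the namespace URI and the local name joined by a colon, which it rejects: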

celery-worker_1  | Traceback (most recent call last):
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/celery/app/trace.py", line 412, in trace_task
celery-worker_1  |     R = retval = fun(*args, **kwargs)
celery-worker_1  |   File "/src/AIPscan/celery.py", line 17, in __call__
celery-worker_1  |     return TaskBase.__call__(self, *args, **kwargs)
celery-worker_1  |   File "/src/AIPscan/celery.py", line 17, in __call__
celery-worker_1  |     return TaskBase.__call__(self, *args, **kwargs)
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/celery/app/trace.py", line 704, in __protected_call__
celery-worker_1  |     return self.run(*args, **kwargs)
celery-worker_1  |   File "/src/AIPscan/Aggregator/tasks.py", line 353, in get_mets
celery-worker_1  |     database_helpers.process_aip_data(aip, mets)
celery-worker_1  |   File "/src/AIPscan/Aggregator/database_helpers.py", line 397, in process_aip_data
celery-worker_1  |     create_file_object(FileType.original, file_, aip.id)
celery-worker_1  |   File "/src/AIPscan/Aggregator/database_helpers.py", line 365, in create_file_object
celery-worker_1  |     _add_characteristics_extension(fs_entry, new_file.id)
celery-worker_1  |   File "/src/AIPscan/Aggregator/database_helpers.py", line 314, in _add_characteristics_extension
celery-worker_1  |     file_.characteristics_extension = premis_object.tostring()
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/metsrw/plugins/premisrw/premis.py", line 139, in tostring
celery-worker_1  |     self.serialize(), pretty_print=pretty_print, encoding=encoding
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/metsrw/plugins/premisrw/premis.py", line 135, in serialize
celery-worker_1  |     return data_to_premis(self._data, self.premis_version)
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/metsrw/plugins/premisrw/premis.py", line 722, in data_to_premis
celery-worker_1  |     return _data_to_lxml_el(data, "premis", nsmap)
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/metsrw/plugins/premisrw/premis.py", line 608, in _data_to_lxml_el
celery-worker_1  |     _data_to_lxml_el(
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/metsrw/plugins/premisrw/premis.py", line 608, in _data_to_lxml_el
celery-worker_1  |     _data_to_lxml_el(
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/metsrw/plugins/premisrw/premis.py", line 608, in _data_to_lxml_el
celery-worker_1  |     _data_to_lxml_el(
celery-worker_1  |   [Previous line repeated 3 more times]
celery-worker_1  |   File "/usr/local/lib/python3.8/dist-packages/metsrw/plugins/premisrw/premis.py", line 620, in _data_to_lxml_el
celery-worker_1  |     ret = func(*args)
celery-worker_1  |   File "src/lxml/builder.py", line 208, in lxml.builder.ElementMaker.__call__
celery-worker_1  |   File "src/lxml/etree.pyx", line 3022, in lxml.etree.Element
celery-worker_1  |   File "src/lxml/apihelpers.pxi", line 101, in lxml.etree._makeElement
celery-worker_1  |   File "src/lxml/apihelpers.pxi", line 1734, in lxml.etree._tagValidOrRaise
celery-worker_1  | ValueError: Invalid tag name 'http://hul.harvard.edu/ois/xml/ns/fits/fitsOutput:tool'
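
To illustrate the tag format in that last line: the string is a namespace URI and a local name joined by a colon, which is not a valid XML name. lxml accepts namespaced tags in Clark notation, `{uri}local`. Below is a minimal sketch of the conversion, not a proposed patch. `to_clark` is a hypothetical helper that does not exist in metsrw, and treating anything containing `://` before the last colon as a namespace URI is an assumption.

```python
def to_clark(tag):
    """Convert 'namespace-uri:local' into Clark notation '{namespace-uri}local'.

    Hypothetical helper, not part of metsrw. It splits on the LAST colon,
    because the namespace URI itself contains colons.
    """
    uri, sep, local = tag.rpartition(":")
    # Assumption: only treat the left-hand part as a namespace if it looks
    # like a URI; plain names such as "objectIdentifier" pass through.
    if not sep or "://" not in uri:
        return tag
    return "{%s}%s" % (uri, local)


# The tag name taken from the ValueError above.
bad_tag = "http://hul.harvard.edu/ois/xml/ns/fits/fitsOutput:tool"
print(to_clark(bad_tag))
```

Whether the fix belongs at parse time (storing Clark-notation tags) or at serialization time (converting and extending the namespace map) is a separate design question for the library.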

mcantelon commented 1 year ago

I'm running into the same problem when serializing ExifTool output:


  File "/home/mike/repos/TEST/mt.py", line 11, in <module>
    premis_object.serialize()
  File "/home/mike/.local/lib/python3.10/site-packages/metsrw/plugins/premisrw/premis.py", line 135, in serialize
    return data_to_premis(self._data, self.premis_version)
  File "/home/mike/.local/lib/python3.10/site-packages/metsrw/plugins/premisrw/premis.py", line 722, in data_to_premis
    return _data_to_lxml_el(data, "premis", nsmap)
  File "/home/mike/.local/lib/python3.10/site-packages/metsrw/plugins/premisrw/premis.py", line 608, in _data_to_lxml_el
    _data_to_lxml_el(
  File "/home/mike/.local/lib/python3.10/site-packages/metsrw/plugins/premisrw/premis.py", line 608, in _data_to_lxml_el
    _data_to_lxml_el(
  File "/home/mike/.local/lib/python3.10/site-packages/metsrw/plugins/premisrw/premis.py", line 608, in _data_to_lxml_el
    _data_to_lxml_el(
  [Previous line repeated 2 more times]
  File "/home/mike/.local/lib/python3.10/site-packages/metsrw/plugins/premisrw/premis.py", line 620, in _data_to_lxml_el
    ret = func(*args)
  File "src/lxml/builder.py", line 208, in lxml.builder.ElementMaker.__call__
  File "src/lxml/etree.pyx", line 3022, in lxml.etree.Element
  File "src/lxml/apihelpers.pxi", line 101, in lxml.etree._makeElement
  File "src/lxml/apihelpers.pxi", line 1734, in lxml.etree._tagValidOrRaise
ValueError: Invalid tag name 'http://ns.exiftool.ca/exiftool/1.0/:exifToolVersion'
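
As a rough illustration of what serialization needs here, the sketch below builds the same element with the namespace in Clark notation and a registered prefix. It uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml so that it runs without extra dependencies. The `exiftool` prefix, the `build_element` helper, and the version text are placeholders chosen for the example, not values from metsrw or from the failing METS file.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the ValueError above.
EXIF_NS = "http://ns.exiftool.ca/exiftool/1.0/"


def build_element(uri, local, text, prefix):
    """Build a namespaced element (hypothetical helper, not metsrw API)."""
    # Register a prefix so the serializer does not fall back to ns0, ns1, ...
    ET.register_namespace(prefix, uri)
    element = ET.Element("{%s}%s" % (uri, local))  # Clark notation
    element.text = text
    return element


# "12.40" is a placeholder value for the example.
element = build_element(EXIF_NS, "exifToolVersion", "12.40", "exiftool")
xml = ET.tostring(element, encoding="unicode")
print(xml)
```

lxml's `etree.Element` likewise accepts Clark-notation tags, along with an `nsmap` argument for prefix mappings, so a fix in metsrw would presumably need to supply both for each characterization tool namespace.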