artefactual-labs / mets-reader-writer

Library to parse and create METS files, especially for Archivematica.
https://mets-reader-writer.readthedocs.io
GNU Affero General Public License v3.0

Problem: API for defining metadata plugins is ill-defined #28

Closed jrwdunham closed 6 years ago

jrwdunham commented 6 years ago

The following PRs introduce PREMIS-related functionality to metsrw:

The Write pointer files PR #27 introduces an informal plugin system that allows metsrw's METSDocument class to know how to work with different metadata standards, in this case PREMIS. The API for this plugin system needs to be analyzed and explicitly defined so that it can be used, e.g., to convert PR #20's PREMIS work to a plugin.

Here is how the metsrw plugin system currently works. The constructor of the METSDocument class now accepts a plugins kwarg, which should be a dict mapping mdType values (e.g., 'PREMIS:OBJECT') to classes. Each class is expected to have a class method fromtree that takes an lxml.etree._Element instance (provided by metsrw's FSEntry) and returns an instance of the plugin's representation of the metadata element; cf. the PREMISRW plugin. This allows one to call METSDocument(plugins=my_plugins).get_file(file_uuid=my_file_uuid).get_premis_objects() and get PREMIS objects back as the anticipated type of Python instance.

In the other direction, mets_fs_entry.add_premis_object(premis_object.serialize()) is how a PREMIS metadata element is currently added to an existing metsrw FSEntry instance.

In summary, under the API implicit in PR #27, metadata plugins must be classes that provide:

  1. a fromtree class method that creates an instance given an lxml.etree._Element instance as input, and
  2. a serialize instance method that returns an lxml.etree._Element instance.

This API is definitely not set in stone, but it is a start.
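To make the two-method contract concrete, here is a minimal sketch of a class that would satisfy it, plus a lookup by mdType. Everything in it is hypothetical: ToyPremisObject, load_metadata, and the element names are invented for illustration and are not part of metsrw or premisrw. It also uses the standard library's xml.etree.ElementTree as a stand-in for lxml so that it runs without extra dependencies.

```python
import xml.etree.ElementTree as ET


class ToyPremisObject:
    """Hypothetical plugin class; not part of metsrw or premisrw."""

    def __init__(self, identifier):
        self.identifier = identifier

    @classmethod
    def fromtree(cls, element):
        # Requirement 1: build an instance from a parsed XML element.
        return cls(identifier=element.findtext("objectIdentifierValue"))

    def serialize(self):
        # Requirement 2: return an XML element representing this instance.
        root = ET.Element("object")
        ET.SubElement(root, "objectIdentifierValue").text = self.identifier
        return root


# A dict mapping mdType values to plugin classes, shaped like the
# `plugins` kwarg described above.
plugins = {"PREMIS:OBJECT": ToyPremisObject}


def load_metadata(md_type, element, plugins):
    """Hypothetical helper: dispatch an element to the class for its mdType."""
    return plugins[md_type].fromtree(element)


# Round trip: serialize an instance, then rebuild it through the registry.
original = ToyPremisObject("abc-123")
restored = load_metadata("PREMIS:OBJECT", original.serialize(), plugins)
```

The round trip at the end is the property a plugin author would probably want to test first: whatever serialize emits, fromtree should be able to read back.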

jrwdunham commented 6 years ago

The metsrw commit cf12834a137572b3af47be2e9401ce6ab39c3c26 (part of PR #27) introduces functionality that should facilitate the creation of PREMIS plugins that conform to the API declared by the FSEntry class.

jrwdunham commented 6 years ago

I am closing this because the plugin (dependency injection) system seems to address this. The FSEntry class now declares its dependencies via class attributes that evaluate to Dependency instances, which are passed assertions about the dependency, e.g.:

>>> premis_object_class = Dependency(
...     has_methods('serialize'),
...     has_class_methods('fromtree'),
...     is_class)
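For readers unfamiliar with this style of dependency injection, the following is a rough sketch of how such assertion-checked dependencies could work. It is not metsrw's actual implementation: the REGISTRY dict, the ToyFSEntry class, the two plugin classes, and the bodies of Dependency, has_methods, has_class_methods, and is_class are all assumptions written for illustration, reusing only the names that appear in the snippet above.

```python
import inspect

# Hypothetical registry of provided features, keyed by dependency name.
REGISTRY = {}


def has_methods(*names):
    """Assertion: the candidate exposes a callable for each name."""
    def check(candidate):
        return all(callable(getattr(candidate, n, None)) for n in names)
    return check


def has_class_methods(*names):
    """Assertion: each name is a method bound to the candidate class."""
    def check(candidate):
        for n in names:
            method = getattr(candidate, n, None)
            if not (inspect.ismethod(method) and method.__self__ is candidate):
                return False
        return True
    return check


# Assertion: the candidate is a class rather than an instance.
is_class = inspect.isclass


class Dependency:
    """Descriptor that resolves a dependency and checks it on access."""

    def __init__(self, *assertions):
        self.assertions = assertions

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        candidate = REGISTRY[self.name]
        for assertion in self.assertions:
            if not assertion(candidate):
                raise TypeError(
                    "%r does not satisfy the declared dependency %r"
                    % (candidate, self.name))
        return candidate


class ToyFSEntry:
    premis_object_class = Dependency(
        has_methods('serialize'),
        has_class_methods('fromtree'),
        is_class)


class GoodPlugin:
    @classmethod
    def fromtree(cls, element):
        return cls()

    def serialize(self):
        return "<object/>"


class BadPlugin:
    # Missing the fromtree class method, so it should be rejected.
    def serialize(self):
        return "<object/>"


REGISTRY["premis_object_class"] = GoodPlugin
resolved = ToyFSEntry().premis_object_class
```

In this sketch the checks run each time the attribute is read, so a non-conforming class such as BadPlugin fails with a TypeError at the point of use rather than somewhere deeper in the serialization code.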