sul-dlss / preservation2017

Story repo for preservation core work done summer/fall 2017

Story: Create a Provenance metadata store #4

Open LynnMcRae opened 6 years ago

This is an operational metadata store, part of the overall Preservation Core Catalog. It contains a datetime-stamped record of events/actions relating to the management of an object in Preservation Core: from deposit/ingest of each version, through each replication, to every audit check and any finding or recovery action taken. As the name implies, this constitutes the object's provenance.

This is described as a component of the Preservation Core Catalog, or PCC. It may make sense as a third database table, or it may exist in some other form, e.g., a structured text file. The information must be maintained per object in a readily and efficiently accessible form, so that it can be displayed online via Argo or easily attached to objects for preservation or dissemination.
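To make the database-table option concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name, column names, and helper functions are all hypothetical and chosen only for illustration; the story does not prescribe a schema or a database engine. The index on object identifier plus timestamp is one way to keep per-object lookups efficient.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE provenance_events (
    id          INTEGER PRIMARY KEY,
    object_id   TEXT NOT NULL,   -- identifier of the preserved object
    occurred_at TEXT NOT NULL,   -- ISO 8601 UTC timestamp, e.g. 2017-08-01T09:00:00Z
    what        TEXT NOT NULL,
    who         TEXT NOT NULL,
    description TEXT NOT NULL
);
CREATE INDEX idx_provenance_object_time
    ON provenance_events (object_id, occurred_at);
"""


def open_store(path=":memory:"):
    """Open (or create) the store and install the illustrative schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_event(conn, object_id, occurred_at, what, who, description):
    """Append one event; the connection context manager commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO provenance_events "
            "(object_id, occurred_at, what, who, description) "
            "VALUES (?, ?, ?, ?, ?)",
            (object_id, occurred_at, what, who, description),
        )


def events_for(conn, object_id):
    """Return one object's events, oldest first."""
    cur = conn.execute(
        "SELECT occurred_at, what, who, description "
        "FROM provenance_events WHERE object_id = ? "
        "ORDER BY occurred_at, id",
        (object_id,),
    )
    return cur.fetchall()
```

Storing timestamps as uniformly formatted UTC strings keeps them sortable as text; a production store would more likely use a native timestamp column.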

Desired data is a basic set of when, what, who and description, e.g., "Fixity check passed" or "Object X replaced in Local Archive".
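One possible shape for a single entry, sketched as a Python dataclass. The class name, field types, and the tab-separated `as_line` rendering (which would suit the structured-text-file option) are assumptions for illustration, not part of the story.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceEvent:
    """Hypothetical record holding the four desired fields."""

    when: datetime      # timezone-aware timestamp of the event
    what: str           # short action label
    who: str            # person or service that acted
    description: str    # free-text detail

    def as_line(self) -> str:
        """Render as one tab-separated line with a UTC ISO 8601 timestamp."""
        stamp = self.when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        return f"{stamp}\t{self.what}\t{self.who}\t{self.description}"
```

Freezing the dataclass reflects the idea that a provenance entry, once written, is not edited; corrections would be recorded as new entries.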

Operationally it constitutes an object's history and should be accessible via a Web Service call to authorized users, including Argo for incorporation into its object detail display.
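As a rough sketch of what such a call might return, the function below builds a JSON history for one object and refuses unauthorized callers. The allow-list, the function name, the status codes, and the JSON layout are all hypothetical; the story does not define the service interface or how authorization is decided.

```python
import json

# Hypothetical allow-list of client names; real authorization would differ.
AUTHORIZED_CLIENTS = {"argo"}


def history_response(client, object_id, events):
    """Return (status_code, json_body) for an object's provenance history.

    `events` is a list of dicts with "when", "what", "who" and "description"
    keys, where "when" is a uniformly formatted UTC ISO 8601 string so that
    plain string sorting yields chronological order.
    """
    if client not in AUTHORIZED_CLIENTS:
        return 403, json.dumps({"error": "not authorized"})
    ordered = sorted(events, key=lambda e: e["when"])
    return 200, json.dumps({"object_id": object_id, "events": ordered})
```

A consumer such as Argo could then render the `events` list directly in its object detail display.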

This story does NOT cover related forms of provenance integration or disposition. For instance, objects coming from DOR carry a provenanceMetadata datastream, which contains a small set of event information relating to DOR's management of the objects. In principle it is part of a larger event sequence that covers both DOR and PC custodianship of the object over time. How these are combined for any future export of the object and its complete provenance is not in scope here.

Note the XML schema used for DOR's provenance at https://consul.stanford.edu/display/chimera/Provenance+metadata+--+the+provenanceMetadata+datastream.

If we look just at the event-level information there, we see:

This may suffice for us here, without trying to categorize or codify entries.

Candidate entries (making up text for illustration, could use some crafting):