OCFL / Use-Cases

A repository to help capture, track, and discuss use cases for OCFL. Issues-only, please.

Use external identifiers in place of digests to act as keys within the manifest #47

Closed: zimeon closed this issue 6 months ago

zimeon commented 10 months ago

Some preservation systems assign globally unique IDs to content files. These could be used as keys within an OCFL Object manifest to link metadata, state, and fixity information together, in place of the current use of a digest. Fixity information would then appear only in the fixity section, which would become mandatory.
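To make the proposal concrete, here is a minimal sketch contrasting the two layouts. The first fragment follows the OCFL v1 convention of keying `manifest` and `state` by content digest. The second is only one possible reading of this proposal: the identifier `urn:example:file:0001` is invented for illustration, and the exact shape of the proposed inventory is an assumption, not something the proposal specifies. Both fragments omit other required inventory fields (`id`, `type`, `head`, and so on) for brevity.

```python
import hashlib


def sha512_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-512 digest of data."""
    return hashlib.sha512(data).hexdigest()


content = b"hello ocfl\n"
content_path = "v1/content/hello.txt"
digest = sha512_hex(content)

# OCFL v1 style: the digest is the key that ties manifest and state together.
v1_inventory = {
    "digestAlgorithm": "sha512",
    "manifest": {digest: [content_path]},
    "versions": {"v1": {"state": {digest: ["hello.txt"]}}},
}

# Hypothetical reading of the proposal: an external ID is the key,
# and the digest survives only in a (now mandatory) fixity block.
external_id = "urn:example:file:0001"  # invented identifier, for illustration only
proposed_inventory = {
    "manifest": {external_id: [content_path]},
    "versions": {"v1": {"state": {external_id: ["hello.txt"]}}},
    "fixity": {"sha512": {digest: [content_path]}},
}
```

In the second fragment, nothing in `manifest` or `state` can be checked against the file bytes directly; a reader has to go through `fixity` to find a digest for a given content path.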

rosy1280 commented 10 months ago

Feedback on Use Cases

In advance of version 2 of the OCFL, we are soliciting feedback on use cases. Please feel free to add your thoughts on this use case via the comments.

Polling on Use Cases

In addition to reviewing comments, we are doing an informal poll for each use case that has been tagged as Proposed: In Scope for version 2. You can contribute to the poll for this use case by reacting to this comment. The following reactions are supported:

👍🏼 In favor of the use case
👎🏼 Against the use case
👀 Neutral on the use case

The poll will remain open through the end of February 2024.

srerickson commented 10 months ago

The utility of manifest keys (in OCFL v1) is to uniquely identify content in a way that is independent of the underlying storage architecture. Under this proposal, the key would become an opaque unique identifier for linking "metadata, state and fixity information together". The key may originate in the external storage system, but an OCFL implementation can't really take advantage of that without adding the storage system as a dependency (e.g., for the purposes of validation), which seems undesirable. In addition, using identifiers to link different components of the inventory adds to the inventory size. A better approach might be to structure the inventory.json in a way that makes these linkages unnecessary (here is an example).

Perhaps a concrete example would help address these concerns.
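As a rough illustration of the dependency concern, the sketch below compares how a validator might check a manifest key in each case. It is an interpretation of the comment above, not code from any OCFL implementation: both function names are hypothetical, and `resolve` stands in for whatever lookup an external system would have to provide.

```python
import hashlib
from typing import Callable


def digest_key_is_valid(key: str, data: bytes) -> bool:
    """A digest key can be checked from the file bytes alone."""
    return hashlib.sha512(data).hexdigest() == key


def external_key_is_valid(
    key: str, data: bytes, resolve: Callable[[str], str]
) -> bool:
    """An opaque key says nothing about the bytes by itself.

    `resolve` is a hypothetical callback that asks the external system
    for the expected SHA-512 digest of the file identified by `key`.
    """
    return hashlib.sha512(data).hexdigest() == resolve(key)


data = b"example content"
digest = hashlib.sha512(data).hexdigest()

# Stand-in for the external system: a plain dict lookup.
registry = {"urn:example:file:0001": digest}

print(digest_key_is_valid(digest, data))
print(external_key_is_valid("urn:example:file:0001", data, registry.__getitem__))
```

The second check only works while `registry` (here a dict, in practice the external system) is reachable, which is the extra dependency the comment describes.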

je4 commented 10 months ago

External identifiers could interfere with deduplication: files with identical content could be assigned different identifiers. The external identifier is therefore tied to the "name of the file" rather than to the "content of the file".
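A small sketch of the deduplication point, assuming two files with identical bytes that an external system has given different identifiers. The file names and `urn:example:...` identifiers are invented for illustration.

```python
import hashlib
from collections import defaultdict

# (logical name, hypothetical external ID, content)
files = [
    ("a.txt", "urn:example:file:0001", b"same bytes"),
    ("b.txt", "urn:example:file:0002", b"same bytes"),
]

by_digest = defaultdict(list)       # keyed by content, as in OCFL v1
by_external_id = defaultdict(list)  # keyed by identifier, as proposed

for name, ext_id, data in files:
    by_digest[hashlib.sha512(data).hexdigest()].append(name)
    by_external_id[ext_id].append(name)

# Identical content collapses to one digest key but stays as two ID keys.
print(len(by_digest), len(by_external_id))
```

With digest keys, the two logical files share one manifest entry and the content needs to be stored once. With identifier keys, the manifest has no way to tell that the two entries hold the same bytes.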

rosy1280 commented 6 months ago

At the time of this comment, the votes tallied to -2.

The editors recommend that, if external identifiers need to be recorded, this be implemented as an extension.