OCFL / Use-Cases

A repository to help capture, track, and discuss use cases for OCFL. Issues-only, please.

Application Profiles #50

Open awoods opened 7 months ago

awoods commented 7 months ago

The OCFL is designed to specify the layout, inventory, and versioning of hierarchies of files targeted for long-term preservation.

The OCFL does not take a position on the types of files that are being preserved, how they are interrelated, or any other details that would be required for an application that reads/writes objects in the OCFL storage root to apply the semantics commonly required to manage content and data modeling.

The use case described here is a recognition of this gap between what the OCFL specification provides and what applications implemented over an OCFL Storage Root require.

One of the opportunities afforded by the OCFL is the potential to decouple the preservation storage from the upper-level application that manages the content within the storage. An outcome of this is that it becomes (hypothetically) possible to replace the repository management application with a different application while leaving the OCFL preservation storage in-place. However, the new management application needs to understand the conventions, metadata standards, structure, and semantics of the files within the OCFL object's content directory.

This use case highlights the need for a mechanism to (minimally) provide a hint within an OCFL Storage Root and/or OCFL Objects and/or OCFL Versions to indicate to which application profile the Objects/Versions conform. The "application profile" could be a specification document or (more aspirationally) a machine-actionable configuration.
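As a rough illustration of the "hint" idea only: the sketch below assumes a hypothetical JSON file named `application-profile.json` with a single `profile` key, and assumes an object-level hint overrides a storage-root hint. None of this is defined by OCFL or any existing extension, and where such a file could legitimately live within a storage root or object is itself an open question.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical file name and key; OCFL defines neither.
HINT_FILE = "application-profile.json"
HINT_KEY = "profile"


def read_hint(directory: Path) -> Optional[str]:
    """Return the profile URI declared in `directory`, or None if no hint exists."""
    hint_path = directory / HINT_FILE
    if not hint_path.is_file():
        return None
    data = json.loads(hint_path.read_text(encoding="utf-8"))
    return data.get(HINT_KEY)


def resolve_profile(storage_root: Path, object_root: Path) -> Optional[str]:
    """Prefer an object-level hint; fall back to the storage-root hint (assumed precedence)."""
    return read_hint(object_root) or read_hint(storage_root)
```

A version-level hint could be resolved the same way by adding one more, more specific, lookup in front of the object-level one.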

awoods commented 7 months ago

Feedback on Use Cases

In advance of version 2 of the OCFL, we are soliciting feedback on use cases. Please feel free to add your thoughts on this use case via the comments.

Polling on Use Cases

In addition to reviewing comments, we are doing an informal poll for each use case that has been tagged as Proposed: In Scope for version 2. You can contribute to the poll for this use case by reacting to this comment. The following reactions are supported:

👍🏼 In favor of the use case
👎🏼 Against the use case
👀 Neutral on the use case

The poll will remain open through the end of February 2024.

awoods commented 7 months ago

An example "application profile" can be seen in the Fedora Repository documentation. As a note, the Fedora "application profile" may be more complex than many such profiles as its content model includes RDF representations for each non-RDF content file.

mjordan commented 7 months ago

Not sure if you are familiar with the BagIt Profiles spec but it might be germane here.

mutanthumb commented 6 months ago

Could file retention schedule information be stored here, as well as "access/security" info such as whether files are "institution only" or completely "open"? If the files get separated from the system, these are very important to know. If this is the wrong place for this, please let me know. I've been following this project intermittently from afar.

rosy1280 commented 6 months ago

@mutanthumb based on the information you provide, it seems like the retention schedule and access information are essential to understanding individual objects and might be better stored within the object itself rather than as part of an application profile. That said, an application profile might define what content is contained within an object. The specification is intentionally silent on what constitutes an object other than providing a vague object definition that states an object is a "group of one or more content files and administrative information, that together have a unique identifier." I would say an application profile might define what information you're storing in the object (e.g., files, descriptive metadata, access metadata, retention schedules, etc.) but wouldn't itself store the information.

I'd also be curious whether we're using the same definition of retention schedule. If it's the definition I'm thinking of, the retention schedule oftentimes indicates that content should be destroyed once the schedule says it should no longer be retained in the repository. The specification stresses that an object should be immutable (see section 3.3, specifically the last paragraph). If the retention schedule calls for the destruction of the content, OCFL might not be the best way to store it, although there are community members who have created an extension called mutable head. This extension enables software to create a HEAD version of an object that can be mutated in-place without generating additional versions.

awoods commented 6 months ago

> Not sure if you are familiar with the BagIt Profiles spec but it might be germane here.

@mjordan : Absolutely. BagIt Profiles and METS Profiles are both inspirations for the concept of OCFL Application Profiles.

For OCFL Application Profiles, having something machine-readable and actionable would be maximally useful. However, even a hint or indication that "this OCFL storage root conforms to this (URL) documented profile" would potentially be sufficient for applications that create or consume content and are designed to manage multiple profiles of content within OCFL storage roots.
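To make the "multiple profiles" scenario concrete, here is a minimal sketch of how a consuming application might pick a handler based on a declared profile URL. The profile URL, the handler, and the registry are all hypothetical names invented for illustration; nothing here reflects an existing OCFL feature or client library.

```python
from typing import Callable, Dict

# A handler takes an object identifier and returns a short description of what it did.
ProfileHandler = Callable[[str], str]

# Registry mapping a declared profile URL to the code that understands it.
HANDLERS: Dict[str, ProfileHandler] = {}


def register(profile_url: str) -> Callable[[ProfileHandler], ProfileHandler]:
    """Decorator that registers a handler for one profile URL."""
    def decorator(handler: ProfileHandler) -> ProfileHandler:
        HANDLERS[profile_url] = handler
        return handler
    return decorator


# Hypothetical profile URL, used only as an example key.
@register("https://example.org/profiles/simple-files/1.0")
def handle_simple_files(object_id: str) -> str:
    return f"simple-files handler opened {object_id}"


def open_object(profile_url: str, object_id: str) -> str:
    """Dispatch to the handler for the declared profile, or fail loudly."""
    handler = HANDLERS.get(profile_url)
    if handler is None:
        raise LookupError(f"no handler registered for profile {profile_url!r}")
    return handler(object_id)
```

Failing loudly on an unknown profile is a deliberate choice in this sketch: an application that does not recognize the declared profile probably should not guess at the semantics of the content.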