OCFL / Use-Cases

A repository to help capture, track, and discuss use cases for OCFL. Issues-only, please.

Collapsing OCFL Object Versions #46

Open awoods opened 11 months ago

awoods commented 11 months ago

When versions of an OCFL object have been created that are not considered curatorially significant, it would sometimes be useful to have OCFL support for collapsing those versions. For example, suppose versions 3-6 of the object below are considered to be one curatorially significant version of the object:

[object root]
    ├── 0=ocfl_object_1.1
    ├── inventory.json
    ├── inventory.json.sha512
    ├── v1
    │   ├── inventory.json
    │   └── ...
    ├── v2
    │   ├── inventory.json
    │   └── ...
    ├── v3
    │   ├── inventory.json
    │   └── ...
    ├── v4
    │   ├── inventory.json
    │   └── ...
    ├── v5
    │   ├── inventory.json
    │   └── ...
    ├── v6
    │   ├── inventory.json
    │   └── ...
    ├── v7
    │   ├── inventory.json
    │   └── ...
    └── v8
        ├── inventory.json
        └── ...

It would be helpful to be able to collapse those versions, such as:

[object root]
    ├── 0=ocfl_object_1.1
    ├── inventory.json
    ├── inventory.json.sha512
    ├── v1
    │   ├── inventory.json
    │   └── ...
    ├── v2
    │   ├── inventory.json
    │   └── ...
    ├── v3 (contains collapsed result of previous versions 3-6)
    │   ├── inventory.json
    │   └── ...
    ├── v4 (previous v7)
    │   ├── inventory.json
    │   └── ...
    └── v5 (previous v8)
        ├── inventory.json
        └── ...
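
To make the proposal concrete, here is a minimal sketch of what such a collapse could mean for the root inventory. It is not part of the OCFL specification or of any existing library: `collapse_versions` is a hypothetical helper, and it assumes unpadded version names (`v1`, `v2`, ...), content paths that begin with the version directory, and an inventory that has already been parsed into a dict. It does not touch the `fixity` block, the per-version inventories, or the files on disk, all of which a real implementation would also need to update.

```python
import copy


def collapse_versions(inventory, first, last):
    """Return a new inventory dict in which versions first..last become one version.

    Assumes unpadded version names ("v1", "v2", ...) and content paths that
    start with the version directory, e.g. "v4/content/file.txt".
    """
    numbers = sorted(int(name[1:]) for name in inventory["versions"])
    if first >= last or first not in numbers or last not in numbers:
        raise ValueError("first and last must be existing versions with first < last")

    # Old version number -> new version number.
    renumber = {}
    for n in numbers:
        if n < first:
            renumber[n] = n
        elif n <= last:
            renumber[n] = first
        else:
            renumber[n] = n - (last - first)

    # Keep every version outside the range, plus the final state of the range.
    versions = {}
    for n in numbers:
        if first <= n < last:
            continue  # intermediate states are discarded
        versions[f"v{renumber[n]}"] = copy.deepcopy(inventory["versions"][f"v{n}"])

    # Drop content that no surviving state references, and rewrite the rest.
    live = {digest for v in versions.values() for digest in v["state"]}
    manifest = {}
    owner = {}  # new content path -> digest, used to detect collisions
    for digest, paths in inventory["manifest"].items():
        if digest not in live:
            continue
        manifest[digest] = []
        for path in paths:
            vdir, rest = path.split("/", 1)
            new_path = f"v{renumber[int(vdir[1:])]}/{rest}"
            if owner.setdefault(new_path, digest) != digest:
                raise ValueError(f"content path collision at {new_path}")
            manifest[digest].append(new_path)

    result = copy.deepcopy(inventory)
    result["head"] = f"v{renumber[numbers[-1]]}"
    result["versions"] = versions
    result["manifest"] = manifest
    return result
```

Calling `collapse_versions(inventory, 3, 6)` on the eight-version object above would yield versions `v1` through `v5`, with the new `v3` carrying the state of the old `v6`. The collision check matters because two files stored under the same relative path in different collapsed versions would otherwise land on the same content path.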

rosy1280 commented 10 months ago

Feedback on Use Cases

In advance of version 2 of the OCFL, we are soliciting feedback on use cases. Please feel free to add your thoughts on this use case via the comments.

Polling on Use Cases

In addition to reviewing comments, we are doing an informal poll for each use case that has been tagged as Proposed: In Scope for version 2. You can contribute to the poll for this use case by reacting to this comment. The following reactions are supported:

👍🏼 In favor of the use case
👎🏼 Against the use case
👀 Neutral on the use case

The poll will remain open through the end of February 2024.

srerickson commented 10 months ago

This use case describes a type of transformation to an OCFL object. My understanding is that the OCFL spec doesn't really concern itself with transformations, but with "objects at rest". I don't see why collapsing versions can't be implemented if needed under v1 (e.g., through a re-ingest process, similar to what's required for purging content) or how v2 might be designed to make this easier.

awoods commented 10 months ago

Agreed. This use case may inform implementation notes and (ideally) implementation in OCFL libraries.

slabrams commented 10 months ago

I strongly endorse this use case. The design of our repository system predates OCFL. As a result, routine object editing operations can result in a large number of granular, but curatorially meaningless versions. This needlessly complicates the object's internal structure, increases inventory sizes, and - well - is just an inelegant outcome. Even if the solution is just a formal statement of implementation best practice for collapsing versions, that is important for purposes of getting such practice integrated into well-known OCFL tools and library packages. It is useful if we can all try to converge on doing the same things the same way.

pwinckles commented 10 months ago

Prior relevant discussion: https://github.com/OCFL/spec/issues/367

The implementation notes currently discourage this:

Previous versions of an object should be considered immutable since the composition of later versions of an object may be dependent on them. In addition, the assumption of immutability ensures that copies of different versions of an object remain consistent with each other, avoiding issues with identifying canonicity and reconciliation.

https://ocfl.io/1.1/implementation-notes/#version-immutability
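
The dependency described in that note can be checked mechanically. The following is a rough sketch, not taken from the spec or from any OCFL library: `later_dependents` is a hypothetical helper that assumes unpadded version names and an inventory already parsed into a dict. It reports which later versions reference content whose only stored copy lives under a given version directory.

```python
def later_dependents(inventory, version):
    """Map each later version to the logical paths whose only stored copy
    lives under `version` (e.g. "v3"). Assumes unpadded version names."""
    prefix = version + "/"
    # Digests whose every content path sits inside the given version directory.
    stored_only_here = {
        digest
        for digest, paths in inventory["manifest"].items()
        if paths and all(p.startswith(prefix) for p in paths)
    }
    number = int(version[1:])
    dependents = {}
    for name, ver in inventory["versions"].items():
        if int(name[1:]) <= number:
            continue  # only later versions are of interest
        logical = sorted(
            path
            for digest, paths in ver["state"].items()
            if digest in stored_only_here
            for path in paths
        )
        if logical:
            dependents[name] = logical
    return dependents
```

An empty result for a version means no later state relies on its content, so removing or merging it would not orphan any file; a non-empty result lists exactly what would have to be relocated first.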

scossu commented 7 months ago

I support this use case too, but I am concerned about the renaming of the versions that follow. Is that necessary in order to keep a mandatory gapless sequence?

In that case, this may reveal a weakness in OCFL's reliance on version name semantics to build the object's history. An explicit chain of links in a linked list style would allow for more flexibility (e.g. one may want to name a version by UUID, or checksum), and in this case, wouldn't require changing the location of versions not affected by the change.
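
As a rough illustration of the linked-list idea (purely hypothetical; OCFL inventories have no `previous` field, and both function names are made up for this sketch), each version record could point at its predecessor, so that collapsing a range only relinks one record:

```python
def history(versions, head):
    """Walk the 'previous' links from head and return names oldest-first."""
    chain = []
    seen = set()
    name = head
    while name is not None:
        if name in seen:
            raise ValueError(f"cycle detected at {name!r}")
        seen.add(name)
        chain.append(name)
        name = versions[name]["previous"]
    return chain[::-1]


def collapse_into(versions, survivor, new_previous):
    """Return a copy in which survivor links directly to new_previous and the
    versions in between are dropped. No other version is renamed or moved."""
    result = {name: dict(record) for name, record in versions.items()}
    name = result[survivor]["previous"]
    while name != new_previous:
        if name is None:
            raise ValueError(f"{new_previous!r} is not an ancestor of {survivor!r}")
        following = result[name]["previous"]
        del result[name]
        name = following
    result[survivor]["previous"] = new_previous
    return result
```

With the eight versions from the original example, `collapse_into(versions, "v6", "v2")` leaves the history as `v1, v2, v6, v7, v8`: the names carry no ordering meaning, so `v7` and `v8` stay where they are.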

Has this been discussed elsewhere?

scossu commented 7 months ago

To @pwinckles's point: if, instead of entirely deleting the version folder, a tombstone is left with a reference to the closest valid version, a consumer could be both notified of the deletion and redirected to a useful resource, even if it is not exactly the same one. As we know, in the real world we do delete things, and OCFL should acknowledge that.
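
A minimal sketch of how a consumer might handle such a tombstone, assuming a made-up record shape (a `tombstone` key holding a `see` reference; nothing like this exists in the current spec):

```python
def resolve(versions, name):
    """Follow tombstone redirects from `name`.

    Returns (resolved_name, was_redirected) so the caller can tell that the
    requested version was deleted and which version it was pointed to.
    """
    redirected = False
    seen = set()
    while "tombstone" in versions[name]:
        if name in seen:
            raise ValueError(f"tombstone loop at {name!r}")
        seen.add(name)
        name = versions[name]["tombstone"]["see"]
        redirected = True
    return name, redirected
```

The boolean in the return value is what lets a client both notify the user of the deletion and still serve the closest valid version.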

rosy1280 commented 6 months ago

At the time of this comment the vote tallied to +1. We are marking this as in-scope for version 2 because we suspect it will be related to #42, which is already in scope for version 2 of the specification.