OCFL / Use-Cases

A repository to help capture, track, and discuss use cases for OCFL. Issues-only, please.

Flagging file loss/corruption #14

Open ahankinson opened 6 years ago

ahankinson commented 6 years ago

A periodic audit of a filesystem has revealed that a PDF file no longer matches its checksum, and will no longer open in a PDF reader. Checksums should be used to flag this as a problem and alert a validation client that the object is no longer valid.
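As a rough illustration of the audit described above, the following Python sketch recomputes the digest of every content file listed in an object's `inventory.json` and reports the paths that no longer match. It relies on the `manifest` and `digestAlgorithm` fields of the OCFL inventory; the function names `file_digest` and `find_checksum_failures` are made up for this example. It assumes that the algorithm name is one `hashlib` recognises (for example `sha512` or `sha256`) and that every listed file is still present, so a missing file would raise `FileNotFoundError`.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path, algorithm: str) -> str:
    """Hash a file in chunks so large content files need not fit in memory."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def find_checksum_failures(object_root: Path) -> list[str]:
    """Return content paths whose bytes no longer match the inventory manifest.

    Hypothetical helper: assumes object_root contains an OCFL inventory.json.
    """
    inventory = json.loads(
        (object_root / "inventory.json").read_text(encoding="utf-8")
    )
    algorithm = inventory["digestAlgorithm"]  # e.g. "sha512"
    failures = []
    # The manifest maps each digest to the content paths that carry it.
    for expected, content_paths in inventory["manifest"].items():
        for content_path in content_paths:
            actual = file_digest(object_root / content_path, algorithm)
            # Hex digests are compared case-insensitively.
            if actual != expected.lower():
                failures.append(content_path)
    return sorted(failures)
```

A validation client could treat a non-empty result as the signal that the object is no longer valid.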

ahankinson commented 6 years ago

F2F 2018.09.05: Clarification of 'Corruption' -- the first is that the bits changed on disk, i.e. 'bitrot', leading to files not matching their checksum.

The second is format corruption: A file was provided with a correct checksum, but it did not conform to its own spec, e.g., an empty 'zip' file or a badly-encoded JPEG2000. Issue moved to #30.

We should not allow broken or invalid OCFL objects to exist. In the spec this is a "MUST": objects must be valid, and checksums should match.

Interventions can include fixing an object to be valid, or, if this is not possible, deleting the object.

ahankinson commented 6 years ago

NB: Changed original description to clarify parts of the Use Case that were determined not to be in scope.

neilsjefferies commented 5 years ago

I would say checksum failure is an OCFL concern. File format validation is not - it is too wooly to define what valid is. Try and define what a valid doc file is.

zimeon commented 11 months ago

Editors' meeting 2023-09-22: This might be handled by the same tombstone/flagging mechanism as #42

rosy1280 commented 10 months ago

Feedback on Use Cases

In advance of version 2 of the OCFL, we are soliciting feedback on use cases. Please feel free to add your thoughts on this use case via the comments.

Polling on Use Cases

In addition to reviewing comments, we are doing an informal poll for each use case that has been tagged as Proposed: In Scope for version 2. You can contribute to the poll for this use case by reacting to this comment. The following reactions are supported:

👍🏼 In favor of the use case
👎🏼 Against the use case
👀 Neutral on the use case

The poll will remain open through the end of February 2024.

bdwheele commented 6 months ago

Does "no longer matching its checksum" also encompass the case where the file is simply missing, or is that another case?

In the event that a file has changed or gone missing and there isn't a way to repair the on-storage OCFL object (for example, a replacement file can't be found), it would be good to be able to acknowledge that the file is broken in the object and make the object correct by indicating that the file should have been there but isn't. That way, in the future you can search for objects which contain this information for remediation.
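To make the distinction between a missing file and a changed file concrete, here is a minimal Python sketch that sorts manifest entries into those two kinds of damage. Everything about the record shape (the `missing` and `mismatch` keys) and the function name `classify_damage` is hypothetical; nothing in this thread or in OCFL defines such a record. The `observed` mapping is assumed to come from a separate audit step, with `None` standing for a file that was not found on storage.

```python
def classify_damage(manifest, observed):
    """Sort damaged content paths into 'missing' and 'mismatch'.

    manifest: digest -> list of content paths, as in an OCFL inventory.
    observed: content path -> digest found on storage, or None if absent.
    The returned record shape is hypothetical, for illustration only.
    """
    report = {"missing": [], "mismatch": []}
    for expected, content_paths in manifest.items():
        for content_path in content_paths:
            actual = observed.get(content_path)
            if actual is None:
                # The file should have been there but isn't.
                report["missing"].append(content_path)
            elif actual.lower() != expected.lower():
                # The file exists but its bytes have changed.
                report["mismatch"].append(content_path)
    return report


# Example with made-up short digests and paths.
manifest = {"abc1": ["v1/content/a.pdf"], "def2": ["v1/content/b.tif"]}
observed = {"v1/content/a.pdf": "ffff", "v1/content/b.tif": None}
report = classify_damage(manifest, observed)
```

Keeping the two kinds separate would let a later remediation pass search for objects with known-missing files independently of those with altered files.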

rosy1280 commented 6 months ago

At the time of this comment, the vote tallied to +2. We are marking this as in scope for version 2 because we suspect it will be related to #42, which is already in scope for version 2 of the specification.