OCFL / Use-Cases

A repository to help capture, track, and discuss use cases for OCFL. Issues-only, please.

Support file renaming between versions #26

Closed zimeon closed 5 years ago

zimeon commented 6 years ago

It should be possible to have files with the same content but different names in different versions of an OCFL object. In essence, this is part of a requirement that the content of one version of an OCFL object should not constrain the content of any other version of that object (though overlapping content may lead to storage efficiencies).

ahankinson commented 6 years ago

My thoughts, after having tried to work through this:

1.) The ideas of an 'immutable version' and of 'renaming' are, on the face of it, mutually incompatible. A rename is a change that has no effect on the content, but it does affect the addressability of an object.

2.) We cannot rely on the hash OR the path alone to identify content; we must rely on both. It is possible to have different file names with the same hash (e.g., empty files with different names) and the same file name with different hashes (e.g., an updated file).

3.) Thus we need a file manifest to track both hash and filename identity. However, if we support rename, we also need to support the fact that a file's name can change over time, and provide an indicator of when it changed.

4.) Since the version is immutable, we cannot change the original file name. Thus the only place the rename is recorded is in the inventory, making the inventory crucial to the 'resolution' process: we cannot just go to 'v4' to find the file; we need to go to 'v1' and run a utility that renames the output file based on the inventory before handing it to the user.

5.) Otherwise, we do not support true rename, and instead insist that users re-accession the file under the new name. This can be expensive, depending on file size, OR it may not be a problem if the underlying storage system supports deduplication (e.g., at the hardware level).

6.) We could define a special 'filename.ext.rename' file, put it in the version directory where the rename happened, and give it special semantics. This is a compromise solution that will: a) provide us with a tangible filesystem change (a new file) to indicate the rename action in a given version; b) add a small file (potentially problematic for storage systems); c) still require a software utility to enact the change when the object's versions are resolved into a view of the current state of the object.
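
To make points 3.) and 4.) concrete, here is a minimal sketch of what inventory-driven resolution could look like. It is an illustration only, not anything OCFL specifies: the `Rename` record, the `added_in` map, and the `resolve_stored_path` helper are all hypothetical names, and the sketch assumes a file name is never reused for different content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rename:
    """Hypothetical inventory entry: in `version`, old_name became new_name."""
    version: int
    old_name: str
    new_name: str


def resolve_stored_path(logical_name, at_version, added_in, renames):
    """Map a name as seen in `at_version` back to the file stored on disk.

    added_in: dict of original file name -> version that first stored it.
    renames:  list of Rename entries recorded in the inventory.
    """
    name = logical_name
    # Undo renames from newest to oldest, ignoring any made after at_version.
    for r in sorted(renames, key=lambda r: r.version, reverse=True):
        if r.version <= at_version and r.new_name == name:
            name = r.old_name
    # The stored bytes never moved: they still live under the original name.
    return f"v{added_in[name]}/{name}"


# Example: a file stored in v1, renamed in v3 and again in v4.
added_in = {"report.txt": 1}
renames = [
    Rename(3, "report.txt", "summary.txt"),
    Rename(4, "summary.txt", "final.txt"),
]
stored = resolve_stored_path("final.txt", 4, added_in, renames)
```

Here a request for 'final.txt' in v4 is traced back to the immutable file in v1; the utility would then hand that file to the user under the new name. The sketch does not check that the requested name actually exists in the requested version.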

ahankinson commented 6 years ago

F2F 2018.09.05: Renaming is in scope, but further discussion will be moved to the discussion on the contents of the inventory.

zimeon commented 5 years ago

Supported since v0.1
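
For illustration, a minimal sketch of how a rename can be expressed in an inventory shaped like the OCFL one, where `manifest` maps a content digest to stored content paths and each version's `state` maps a digest to logical paths. The digest value, file names, and the `content_path` helper are placeholders invented for this example, and a real inventory carries further required fields that are omitted here.

```python
# Trimmed-down, inventory-like structure. "abc123" stands in for a real digest.
inventory = {
    "manifest": {
        "abc123": ["v1/content/report.txt"],  # bytes stored once, in v1
    },
    "versions": {
        "v1": {"state": {"abc123": ["report.txt"]}},
        "v2": {"state": {"abc123": ["summary.txt"]}},  # same digest, new name
    },
}


def content_path(inventory, version, logical_path):
    """Return the stored path backing `logical_path` in `version`."""
    state = inventory["versions"][version]["state"]
    for digest, logical_paths in state.items():
        if logical_path in logical_paths:
            # Any manifest entry for the digest holds identical bytes.
            return inventory["manifest"][digest][0]
    raise KeyError(f"{logical_path!r} not present in {version}")


renamed = content_path(inventory, "v2", "summary.txt")
```

The rename in v2 adds no new file on disk: only the logical path in that version's state changes, while the digest still points at the content stored in v1.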