OCFL / spec

The Oxford Common File Layout (OCFL) specifications
https://ocfl.io

Making time between object validity constant and small on S3 #605

Closed (srerickson closed 1 year ago)

srerickson commented 2 years ago

One of the pain points when updating OCFL objects on S3 is that there is no constant-time move/rename operation (as there is on a filesystem) to use when moving content files into place. As a result, objects are invalid for however long it takes to upload new content files and/or copy them between keys on S3. Ideally, the time between valid object states would be constant and would not vary with the number or size of content files. This would make it much easier to implement object locking for version updates (e.g., lock expiration times could be used with more confidence).

One way to make the time between valid object states constant would be to allow incomplete version directories for the head+1 version. I'm not saying this is the best approach; it's just the first idea that comes to mind.
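
A rough sketch of that idea, assuming the relaxed rule were adopted (it is not part of the current spec). The `store` dict stands in for an S3 bucket, and `add_version` is a hypothetical helper, not an existing OCFL library call. It is heavily simplified: it skips inventory sidecars, the inventory copy inside the version directory, and required version fields such as `created`.

```python
import hashlib
import json


def add_version(store, object_root, new_state):
    """Write a new version so that only the final inventory PUT changes the head.

    store: dict mapping S3-style keys to bytes (stand-in for a bucket).
    new_state: dict mapping logical paths to bytes; assumed to be the
    complete logical state of the new version.
    """
    inv_key = f"{object_root}/inventory.json"
    inventory = json.loads(store[inv_key])
    next_version = "v" + str(int(inventory["head"][1:]) + 1)

    # Phase 1 (slow, scales with number/size of files): upload content into
    # the head+1 directory. The root inventory still names the old head, so
    # under the relaxed rule the object would still be considered valid.
    state = {}
    for logical_path, data in new_state.items():
        digest = hashlib.sha512(data).hexdigest()
        if digest not in inventory["manifest"]:
            content_path = f"{next_version}/content/{logical_path}"
            store[f"{object_root}/{content_path}"] = data
            inventory["manifest"][digest] = [content_path]
        state.setdefault(digest, []).append(logical_path)

    # Phase 2 (one small write): replacing the root inventory is the commit.
    inventory["versions"][next_version] = {"state": state}
    inventory["head"] = next_version
    store[inv_key] = json.dumps(inventory).encode()
    return next_version
```

With this ordering, the slow part happens while the old inventory is still authoritative, and the switch to the new version is a single small write whose cost does not depend on how much content was added.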

pwinckles commented 2 years ago

I had considered write-locking objects during validation, at least in ocfl-java, but ended up simply noting in the docs that updating an object while it's being validated will produce inaccurate results. I'm struggling to remember now if there was a specific reason why I decided not to lock.

srerickson commented 2 years ago

My sense is that the urgency of this issue will depend on the overall architecture in which OCFL is used. If operations (access/validation/update) for objects in a storage root are coordinated by a single server process, preventing simultaneous validations and updates is more straightforward. On the other hand, distributed (e.g., git-like) architectures might be easier to implement if transitions between object states during updates were more predictable.
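
For the single-process case, a minimal sketch of what that coordination could look like: one in-memory lock per object ID, held for the duration of a validation or an update. `ObjectLocks` is a hypothetical name used for illustration, not part of any OCFL implementation, and this only works when every operation goes through the same process.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class ObjectLocks:
    """Per-object mutual exclusion for validations and updates in one process."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the lock table
        self._locks = defaultdict(threading.Lock)  # object ID -> lock

    @contextmanager
    def hold(self, object_id):
        # Look up (or create) the object's lock under the guard, then
        # release the guard so other objects are not blocked.
        with self._guard:
            lock = self._locks[object_id]
        with lock:
            yield


locks = ObjectLocks()

with locks.hold("obj-1"):
    # Validate or update obj-1 here; other threads that ask for obj-1
    # block until this block exits, while other objects are unaffected.
    pass
```

Nothing like this is available to independent writers that share only the bucket, which is where a predictable transition time would matter most.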

neilsjefferies commented 1 year ago

Merged into #372