OCFL / Use-Cases

A repository to help capture, track, and discuss use cases for OCFL. Issues-only, please.

Adding data to inventory.json #37

Closed bcail closed 1 year ago

bcail commented 4 years ago

This is for collecting use-cases where a user might want to add some data to inventory.json. In OCFL 1.0, inventory.json is completely locked down, with no possibility of adding an extra key or extra data (even with an extension).

Note: some or all of the use cases here have other options besides using inventory.json. It is possible to use inventory.json as defined in 1.0, and use other options for these use-cases. However, that does not mean that there isn't interest from users in adding data to inventory.json, and some might choose to do so if that's an option.

Mime Types

I am writing an OCFL HTTP layer, and when serving out a file, I want to serve it with the correct Content-Type header. This HTTP layer could be used by different repositories that store technical metadata in different files and formats. One way to have a standard location for mime types would be to define a "mimetypes" key under an "extensions" key in inventory.json, and store the mime type for each file (as identified by its checksum).
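
As a rough illustration of the idea, here is a minimal Python sketch of how an HTTP layer might look up a Content-Type from such a key. The "extensions" and "mimetypes" keys follow the layout proposed in this issue and are not part of OCFL 1.0. The digests are shortened placeholders, and `content_type_for` and `DEFAULT_TYPE` are names invented for this example.

```python
# Hypothetical inventory: the "extensions" -> "mimetypes" object is the layout
# proposed in this issue and is NOT allowed by OCFL 1.0.
# Digests are shortened placeholders, not real sha512 values.
inventory = {
    "id": "example:object-1",
    "type": "https://ocfl.io/1.0/spec/#inventory",
    "digestAlgorithm": "sha512",
    "head": "v1",
    "manifest": {
        "abc123": ["v1/content/image.tiff"],
        "def456": ["v1/content/notes.txt"],
    },
    "versions": {
        "v1": {
            "created": "2020-06-15T00:00:00Z",
            "state": {
                "abc123": ["image.tiff"],
                "def456": ["notes.txt"],
            },
        }
    },
    "extensions": {
        "mimetypes": {"abc123": "image/tiff", "def456": "text/plain"}
    },
}

# Fallback used when no mime type is recorded for a file.
DEFAULT_TYPE = "application/octet-stream"


def content_type_for(inventory, logical_path, version=None):
    """Return the recorded mime type for a logical path in a version."""
    version = version or inventory["head"]
    state = inventory["versions"][version]["state"]
    for digest, paths in state.items():
        if logical_path in paths:
            # Mime types are keyed by digest, as suggested above.
            mimetypes = inventory.get("extensions", {}).get("mimetypes", {})
            return mimetypes.get(digest, DEFAULT_TYPE)
    raise KeyError(logical_path)
```

Keying by digest means a file that appears under several logical paths, or in several versions, needs only one mime type entry.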

Related Note: we are coming from Fedora 3, where the mime type is stored in the FOXML, not in a datastream. Of course we can put the mime type in a content file if we choose to, but using inventory.json would have some similarities to the FOXML from Fedora 3.

IU tape preservation

See https://github.com/OCFL/spec/issues/474#issuecomment-642919476, https://github.com/OCFL/spec/issues/474#issuecomment-643255722, https://github.com/OCFL/spec/issues/474#issuecomment-643435360 for @bdwheele comments about adding data to inventory.json.

Suggestion

Open up the spec to allow extensions to define data that can be placed in the "extensions" object in inventory.json. Any validation tools must ignore the "extensions" data if they don't understand it, but validate the data if they do. So a basic OCFL validation tool that doesn't support any extensions would make sure inventory.json parses, validate all the spec keys, ignore the "extensions" data, and report the object as valid.
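
The validation behaviour described above could be sketched in Python roughly as follows. This is only an illustration under the assumptions of this suggestion: the "extensions" key is not valid in OCFL 1.0, the check of the spec keys is reduced to their presence, and `validate_inventory`, `check_mimetypes`, and the `extension_validators` mapping are hypothetical names.

```python
import json

# Top-level keys defined for inventory.json in OCFL 1.0.
SPEC_KEYS = {"id", "type", "digestAlgorithm", "head", "contentDirectory",
             "fixity", "manifest", "versions"}
REQUIRED_KEYS = {"id", "type", "digestAlgorithm", "head", "manifest", "versions"}


def validate_inventory(text, extension_validators=None):
    """Return a list of error strings; an empty list means valid.

    extension_validators maps an extension name to a function that takes
    that extension's data and returns a list of error strings.
    """
    extension_validators = extension_validators or {}
    try:
        inventory = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"inventory.json does not parse: {exc}"]
    if not isinstance(inventory, dict):
        return ["inventory.json must be a JSON object"]

    errors = []
    present = set(inventory)
    for key in sorted(REQUIRED_KEYS - present):
        errors.append(f"missing required key: {key}")
    # Only the proposed "extensions" key is tolerated beyond the spec keys.
    for key in sorted(present - SPEC_KEYS - {"extensions"}):
        errors.append(f"unexpected key: {key}")

    extensions = inventory.get("extensions", {})
    if not isinstance(extensions, dict):
        errors.append('"extensions" must be a JSON object')
        return errors
    for name, data in extensions.items():
        validator = extension_validators.get(name)
        if validator is not None:
            # Understood extension: validate its data.
            errors.extend(validator(data))
        # Unknown extension: ignored, as proposed above.
    return errors


def check_mimetypes(data):
    """Hypothetical validator for a "mimetypes" extension."""
    if not isinstance(data, dict):
        return ['"mimetypes" must be a JSON object']
    return [f"mime type for {digest} must be a string"
            for digest, value in data.items() if not isinstance(value, str)]
```

A basic tool would call `validate_inventory(text)` with no validators, while a tool that understands the mime type extension would pass `{"mimetypes": check_mimetypes}`.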

Any extensions that are submitted can be analyzed via the extension process to see if they would be a good fit for inventory.json.

bdwheele commented 4 years ago

Obviously a +1 from me :)

rosy1280 commented 4 years ago

@bcail it seems like your use case might be solvable with an extension, correct?

bdwheele commented 4 years ago

I'm going to betray my ignorance here, but if it's implemented as an extension, it would mean creating an additional file, correct?

Since my archival storage is tape-based, it is substantially more efficient for me to have metadata in a single file so if I need to retrieve information for an object it is a single tape read, rather than finding several files which may not even be on the same tape.

Additionally, if I'm updating an OCFL object, fewer files would need to be overwritten, meaning that the number of "holes" in the tapes would be minimized.

bcail commented 4 years ago

@rosy1280 no - an extension is only possible if the spec is updated (if I'm mis-reading the spec, let me know, but my understanding is that an extension is not allowed to add data to inventory.json)

rosy1280 commented 1 year ago

Update the language in [3.1 Object Structure](https://ocfl.io/draft/spec/#object-structure) to be more specific about what the "following sections" are.

The OCFL Object Root must not contain files or directories other than those specified in sections 3.2 - 3.9

Or something like that... see OCFL/spec#637

julianmorley commented 1 year ago

One possible solution: An extension that lets you store, in the /extensions directory, a complete copy of the current version's inventory.json that also contains any additional JSON keys the local organization requires. The local tooling would know to read the modified inventory.json in the extensions directory instead of the vanilla inventory.json for operations that need the inventory and these new elements in a single file read.
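
A minimal Python sketch of how local tooling might do this, assuming a made-up extension directory name (`local-extended-inventory`) and a made-up helper `load_inventory`. Neither is a registered OCFL extension or an existing API.

```python
import json
from pathlib import Path

# Hypothetical extension directory name, used only for this sketch.
EXTENSION_NAME = "local-extended-inventory"


def load_inventory(object_root, extension_name=EXTENSION_NAME):
    """Load the extended inventory if present, else the standard one.

    Only one file is read in either case, which is the point of the
    approach for tape-backed storage.
    """
    root = Path(object_root)
    extended = root / "extensions" / extension_name / "inventory.json"
    if extended.is_file():
        # Copy of the current inventory plus locally defined keys.
        return json.loads(extended.read_text(encoding="utf-8"))
    # Fall back to the vanilla inventory.json in the object root.
    return json.loads((root / "inventory.json").read_text(encoding="utf-8"))
```

The sketch does not check that the extended copy is still in sync with the root inventory.json. Tooling that writes new versions would have to update both files together.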

awoods commented 1 year ago

Another possible solution: An application profile could define an application-specific file that includes the desired metadata.

rosy1280 commented 1 year ago

@bdwheele thoughts on the most recent comments editors added?

neilsjefferies commented 1 year ago

Several mechanisms for achieving this have been discussed. Editors feel we can close this since there hasn't been any further discussion.

bdwheele commented 9 months ago

(sorry for the extended delay in responding -- got pulled into another project and got totally sidetracked) Between application profiles and extensions, it seems like we should be able to do what was originally envisioned.