OCFL / Use-Cases

A repository to help capture, track, and discuss use cases for OCFL. Issues-only, please.

Preserve file mimetype #19

Closed tomwrobel closed 6 years ago

tomwrobel commented 6 years ago

For digital preservation purposes, and also for compatibility with the Fedora 4 API, an institution may wish to preserve the mimetype of a binary file as metadata within the OCFL file system.

ahankinson commented 6 years ago

Although this may be useful information to have, it may be out-of-scope for OCFL. File metadata, including mimetype, is best stored in application-specific metadata. The same file can have different mimetypes (application/xml and application/tei+xml would both be valid for a TEI file), so mimetype discovery is somewhat non-deterministic. Recording it also adds nothing to file versioning, which does not need the mimetype.
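To illustrate the ambiguity, here is a minimal Python sketch. It is not part of OCFL: the function name `candidate_mimetypes` and the tiny extension table are assumptions made up for the example. It only shows that a name-based guess and a content-based guess can both be defensible for the same TEI file and still differ.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical, deliberately tiny extension table for this example only.
EXTENSION_TYPES = {".xml": "application/xml", ".txt": "text/plain"}

TEI_NAMESPACE = "http://www.tei-c.org/ns/1.0"


def candidate_mimetypes(filename, content):
    """Return every mimetype that could reasonably describe the file."""
    candidates = []

    # Guess 1: look only at the file extension.
    ext = os.path.splitext(filename)[1].lower()
    if ext in EXTENSION_TYPES:
        candidates.append(EXTENSION_TYPES[ext])

    # Guess 2: look at the content; a TEI root element suggests TEI XML.
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        return candidates  # not well-formed XML, nothing more to add
    if root.tag == "{%s}TEI" % TEI_NAMESPACE:
        candidates.append("application/tei+xml")

    return candidates


tei = b'<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader/></TEI>'
print(candidate_mimetypes("letter.xml", tei))
```

Both answers are valid here, so whichever one a tool records is a choice made by that tool rather than a property of the bytes.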

zimeon commented 6 years ago

I also think this should be out-of-scope for direct OCFL support, but I think it can be handled in an implementation-specific way, either through application-specific metadata or through a possible extension of the data in the inventory files (at some stage we'll need to decide how locked-down or extensible admin files are).

ahankinson commented 6 years ago

OK, so this is a useful test-case for our Use Case process. I think we should now give @tomwrobel a chance to respond, either to make a further case or to permit us to mark this out-of-scope and close. Do we need more eyes on this? Or more input?

tomwrobel commented 6 years ago

TL;DR feel free to declare it out of scope

As I'm envisioning an OCFL-service, I want to be able to use OCFL in order to provide a service that is Fedora API compliant (for binary file saving, retrieval, etc.). If a piece of metadata that the Fedora API seems to require isn't stored in OCFL, then I would have to have a separate OCFL-service metadata store, and that way lies madness!

However, after discussion with digital preservation colleagues, I think a) @ahankinson's point about unreliable mimetypes is a good one and b) a mimetype can be generated on-the-fly if necessary from file extension/magic number information in any case.
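A rough sketch of what generating the mimetype on the fly could look like in Python, assuming the service can read the first few bytes of the stored file. The function name `guess_mimetype` and the short signature table are assumptions for illustration; only `mimetypes.guess_type` is a standard-library call, and its result can depend on the platform's mimetype tables.

```python
import mimetypes

# A few well-known magic numbers; a real service would need a fuller table.
MAGIC_NUMBERS = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def guess_mimetype(filename, head):
    """Guess a mimetype from the leading bytes, then from the extension."""
    # 1. Magic number: compare the start of the file with known signatures.
    for signature, mimetype in MAGIC_NUMBERS:
        if head.startswith(signature):
            return mimetype

    # 2. File extension: fall back to the standard-library lookup.
    by_extension, _encoding = mimetypes.guess_type(filename)
    if by_extension is not None:
        return by_extension

    # 3. Nothing matched: use the generic binary type.
    return "application/octet-stream"


print(guess_mimetype("scan.bin", b"%PDF-1.7\n"))
```

Because the answer is derived at request time, nothing about it has to be written into the OCFL object, which is the trade-off being accepted here.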