datacite / schema-docs

Using Distribution for a collection of files — DataCite Metadata Schema 4.5 documentation #17

Closed utterances-bot closed 1 year ago

utterances-bot commented 2 years ago

https://datacite-metadata-schema.readthedocs.io/en/4.5_draft/guidance/distribution.html

paulmillar commented 2 years ago

I would like to point out that there is a possible alternative to BagIt (when used as described in the recommendation) called Metalink (RFC 5854).

When used as a container for links to data, BagIt and Metalink have considerable overlap in functionality. I believe the information in any BagIt file (with an empty /data directory) could be expressed using Metalink without losing any information.

However, the Metalink format appears to be more flexible. It supports richer file-level metadata (not just checksums and file size), which may be extended to include arbitrary metadata.
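To make the overlap concrete, here is a minimal sketch of how the entries of a bag's fetch.txt and one payload manifest might be mapped onto a Metalink (RFC 5854) document. The function name `bag_to_metalink` and its parameters are hypothetical, and the parsing assumes simple whitespace-separated lines; it is an illustration, not a complete converter (tag files, multiple manifests, and percent-encoded names are ignored).

```python
import xml.etree.ElementTree as ET

# XML namespace defined by RFC 5854 for Metalink documents.
METALINK_NS = "urn:ietf:params:xml:ns:metalink"


def bag_to_metalink(fetch_txt: str, manifest_txt: str, hash_type: str = "sha-256") -> str:
    """Hypothetical helper: express fetch.txt + manifest entries as Metalink XML."""
    # Manifest lines have the form "<checksum> <path>".
    checksums = {}
    for line in manifest_txt.splitlines():
        if line.strip():
            digest, path = line.split(None, 1)
            checksums[path.strip()] = digest

    ET.register_namespace("", METALINK_NS)
    root = ET.Element(f"{{{METALINK_NS}}}metalink")

    # fetch.txt lines have the form "<url> <length> <path>"; "-" means unknown length.
    for line in fetch_txt.splitlines():
        if not line.strip():
            continue
        url, length, path = line.split(None, 2)
        path = path.strip()
        file_el = ET.SubElement(root, f"{{{METALINK_NS}}}file", {"name": path})
        if length != "-":
            ET.SubElement(file_el, f"{{{METALINK_NS}}}size").text = length
        if path in checksums:
            hash_el = ET.SubElement(file_el, f"{{{METALINK_NS}}}hash", {"type": hash_type})
            hash_el.text = checksums[path]
        ET.SubElement(file_el, f"{{{METALINK_NS}}}url").text = url

    return ET.tostring(root, encoding="unicode")
```

Each fetch.txt line becomes one `file` element, with the manifest checksum attached as a `hash` child. Any additional file-level metadata would be added as further children of `file`, which is where Metalink goes beyond what the BagIt tag files carry.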

There are also several independent clients that support the Metalink format, whereas BagIt seems to be supported only by software from the Library of Congress.

Given the target audience, I understand the motivation for recommending BagIt; however, I was wondering whether Metalink might be an interesting alternative, at least for certain use-cases.

paulmillar commented 2 years ago

One other comment: field 21.a "mediaType" has a cardinality of 1, so it is a required element.

I did a quick search and could find no registered MIME type for BagIt.

The underlying file format is zip. Therefore, BagIt files would likely be identified as either application/octet-stream or application/zip. Neither informs the DataCite metadata consumer that the distribution is a BagIt file.
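In practice this means a consumer has to open the archive to find out what it is. The sketch below, assuming the zip serialization mentioned above and a single top-level directory containing the bag declaration file bagit.txt, shows the kind of content sniffing that a generic media type forces on the consumer. The function name `looks_like_bagit_zip` is hypothetical.

```python
import io
import zipfile


def looks_like_bagit_zip(data: bytes) -> bool:
    """Hypothetical check: does this zip archive contain <top-level-dir>/bagit.txt?"""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return False
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Accept only a bag declaration directly inside the top-level directory.
            if len(parts) == 2 and parts[1] == "bagit.txt":
                return True
    return False
```

With a dedicated media type in the "mediaType" field, this inspection step would be unnecessary: the metadata alone would tell the consumer how to interpret the distribution.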

I suggest that someone register a MIME type (e.g., "application/bagit+zip") with IANA, or contact the BagIt community to understand why this hasn't already happened.

paulmillar commented 2 years ago

One further comment: I believe what DataCite is doing here is describing a BagIt profile.

It may be helpful to do this in a more formal way; for example, using the BagIt profiles specification.

paulmillar commented 2 years ago

It seems the BagIt profile language is currently not powerful enough to describe the above profile. I've opened an issue describing this problem.