netwerk-digitaal-erfgoed / requirements-datasets

Requirements for datasets
https://netwerk-digitaal-erfgoed.github.io/requirements-datasets/

Distinguish distribution types #45

Open ddeboer opened 3 years ago

ddeboer commented 3 years ago

How can we make the type of distribution more explicit? A distribution currently looks like:

{
  "@id": "http://vocab.getty.edu/aat/sparql",
  "@type": "DataDownload",
  "encodingFormat": "application/sparql-query",
  "contentUrl": "http://vocab.getty.edu/sparql"
}

The type is determined based on schema:encodingFormat. We may distinguish:

| Type | encodingFormat | Explanation |
|---|---|---|
| SPARQL endpoint | application/sparql-query | request content type |
| RDF file dump | text/turtle, application/ld+json etc. | MIME type of the file |
| Non-RDF file dump | text/csv etc. | MIME type of the file |
| Content-negotiated RDF | ? | How to express same contentUrl providing multiple MIME types? |
| OAI-PMH endpoint | ? | How to express OAI-PMH endpoint which is underdetermined by just application/xml? |
| Zipped file | ? | Schema.org suggests application/zip but that underdetermines the type of file contained in the archive. |

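
To make the current behaviour concrete, here is a minimal Python sketch of how a consumer might derive the distribution type from schema:encodingFormat alone. The function name, the set of RDF media types and the "underdetermined" label are assumptions for illustration, not part of the requirements; the sketch also assumes encodingFormat is a single string.

```python
# Hypothetical helper: the names and the media type list are illustrative only.
RDF_MEDIA_TYPES = {
    "text/turtle",
    "application/ld+json",
    "application/rdf+xml",
    "application/n-triples",
    "application/n-quads",
}

# Formats that, on their own, do not tell us what kind of distribution this is.
UNDERDETERMINED = {"application/xml", "application/zip", "application/gzip"}


def distribution_type(distribution: dict) -> str:
    """Guess the distribution type from schema:encodingFormat only."""
    raw = distribution.get("encodingFormat", "")
    # Drop parameters such as "; charset=utf-8" and normalise case.
    fmt = raw.split(";", 1)[0].strip().lower()
    if not fmt:
        return "unknown"
    if fmt == "application/sparql-query":
        return "SPARQL endpoint"
    if fmt in RDF_MEDIA_TYPES:
        return "RDF file dump"
    if fmt in UNDERDETERMINED:
        return "underdetermined"
    return "Non-RDF file dump"


getty = {
    "@id": "http://vocab.getty.edu/aat/sparql",
    "@type": "DataDownload",
    "encodingFormat": "application/sparql-query",
    "contentUrl": "http://vocab.getty.edu/sparql",
}
print(distribution_type(getty))
```

The last three rows of the table are exactly the cases where such a lookup cannot give a definite answer.
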
coret commented 2 years ago

application/sparql-query shouldn't be used, as this is the request content type. What is intended (and expected) is the content type of the response. The default content type (the one returned if the request carries no Accept header) is not defined.

I think we must apply the same logic to the "Content-negotiated RDF" case: stick with the default content type of the response, that is, the one returned when the request has no Accept header.
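
As a rough illustration of "default content type", one could request the contentUrl without an Accept header and record the Content-Type of the response. The sketch below uses only the Python standard library; the function names are made up for this example, and it assumes the server answers a plain GET (a SPARQL endpoint may well return an HTML form or an error when no query is supplied).

```python
from urllib.request import Request, urlopen


def media_type(content_type_header: str) -> str:
    """Strip parameters such as charset from a Content-Type header value."""
    return content_type_header.split(";", 1)[0].strip().lower()


def build_probe(url: str) -> Request:
    """Build a GET request that deliberately sets no Accept header."""
    return Request(url, method="GET")


def default_media_type(url: str, timeout: float = 10.0) -> str:
    """Return the media type the server chooses when the client states no preference."""
    with urlopen(build_probe(url), timeout=timeout) as response:
        return media_type(response.headers.get("Content-Type", ""))
```

Calling `default_media_type("http://vocab.getty.edu/sparql")` would perform a real network request; what it returns depends entirely on the server.
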

In regard to the OAI-PMH endpoint, the [Open Archives Initiative Protocol for Metadata Harvesting](http://www.openarchives.org/OAI/openarchivesprotocol.html#XMLResponse) specification says:

All responses to OAI-PMH requests must be well-formed XML instance documents.

An OAI-PMH endpoint could provide XML data in one or more metadata formats, but these can be discovered by using verb=ListMetadataFormats.
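
A small Python sketch of that discovery step, assuming a hypothetical base URL such as https://example.org/oai: it builds the ListMetadataFormats request URL and extracts the metadataPrefix values from a response document. The function names are illustrative; the XML namespace is the OAI-PMH 2.0 one.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_metadata_formats_url(base_url: str) -> str:
    """Build the request URL for the ListMetadataFormats verb."""
    return f"{base_url}?{urlencode({'verb': 'ListMetadataFormats'})}"


def parse_metadata_prefixes(xml_text: str) -> list:
    """Return the metadataPrefix of every metadataFormat in a response."""
    root = ET.fromstring(xml_text)
    prefixes = root.findall(".//oai:metadataFormat/oai:metadataPrefix", OAI_NS)
    return [el.text.strip() for el in prefixes if el.text]


# Abbreviated sample response, written by hand for this example.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
"""

print(list_metadata_formats_url("https://example.org/oai"))
print(parse_metadata_prefixes(SAMPLE_RESPONSE))
```
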

In the case of compressed files that contain one or more files of only one type, I suggest using the MIME type of the contents rather than application/zip or application/gzip.
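
For zip archives, a publisher could check this rule mechanically before choosing an encodingFormat. The following Python sketch is only an illustration: the extension-to-MIME mapping and the function name are assumptions, and it looks at file names, not at file contents.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping; extend as needed for the formats you publish.
EXTENSION_TO_MEDIA_TYPE = {
    ".ttl": "text/turtle",
    ".nt": "application/n-triples",
    ".jsonld": "application/ld+json",
    ".csv": "text/csv",
    ".xml": "application/xml",
}


def archive_content_type(data: bytes) -> Optional[str]:
    """Return the single MIME type shared by all files in a zip archive.

    Returns None when the archive is empty, mixes several types, or
    contains a file with an extension missing from the mapping.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # Skip directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    types = {
        EXTENSION_TO_MEDIA_TYPE.get(PurePosixPath(n).suffix.lower())
        for n in names
    }
    if len(types) == 1 and None not in types:
        return types.pop()
    return None


def make_zip(file_names) -> bytes:
    """Build a small in-memory zip archive for demonstration purposes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in file_names:
            archive.writestr(name, "")
    return buffer.getvalue()
```

An archive for which this returns None would still need another way to describe its contents.
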

coret commented 1 year ago

See https://github.com/netwerk-digitaal-erfgoed/requirements-datasets/issues/67, https://github.com/netwerk-digitaal-erfgoed/requirements-datasets/issues/64 and https://github.com/netwerk-digitaal-erfgoed/requirements-datasets/issues/68