ESIPFed / science-on-schema.org

science-on-schema.org - providing guidance for publishing schema.org as JSON-LD for the sciences
Apache License 2.0

Adopt DCAT accessURL, compressFormat and packageFormat #191

Open smrgeoinfo opened 2 years ago

smrgeoinfo commented 2 years ago

Currently, schema:distribution takes a schema:DataDownload value that has only schema:contentUrl and schema:encodingFormat properties. This does not support a clear indication of whether the distribution link points to a landing page, feed, SPARQL endpoint, or web application for requesting data, or is a direct download link to a file that contains the data. DCAT solves this problem with dcat:accessURL and dcat:downloadURL.

A direct download link (dcat:downloadURL) might return a file containing data in the format given by the schema:DataDownload/schema:encodingFormat string, but the dataset might instead be delivered as a compressed file, or inside a data package such as BagIt or a TAR bundle. For machine-actionable processing, it is very useful for the dataset distribution to state explicitly what the download URL will return. DCAT solves this problem by adding the properties dcat:compressFormat and dcat:packageFormat.
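To illustrate why these properties help machine-actionable processing, here is a minimal Python sketch of how a harvester might plan its steps from a distribution record. The function name `plan_retrieval`, the example URLs, and the use of plain media-type strings as property values are assumptions made for illustration; they are not part of SOSO or DCAT guidance.

```python
from typing import Dict, List


def plan_retrieval(dist: Dict[str, str]) -> List[str]:
    """Return the ordered steps a client would take for one distribution.

    Hypothetical helper: `dist` is a flat dict keyed by compact property
    names, with plain strings as values (an illustrative simplification).
    """
    download_url = dist.get("dcat:downloadURL")
    if download_url is None:
        # No direct download: the best a client can do is point a
        # person (or a service-specific client) at the access URL.
        access_url = dist.get("dcat:accessURL")
        return [f"visit {access_url}"] if access_url else []

    steps = [f"download {download_url}"]
    # Compression wraps the package, so it is undone first.
    if "dcat:compressFormat" in dist:
        steps.append(f"decompress ({dist['dcat:compressFormat']})")
    if "dcat:packageFormat" in dist:
        steps.append(f"unpack ({dist['dcat:packageFormat']})")
    steps.append(f"parse as {dist.get('encodingFormat', 'unknown')}")
    return steps


# Example: a gzipped TAR bundle that contains CSV files.
bundle = {
    "dcat:downloadURL": "https://example.org/data/obs.tar.gz",
    "dcat:compressFormat": "application/gzip",
    "dcat:packageFormat": "application/x-tar",
    "encodingFormat": "text/csv",
}
for step in plan_retrieval(bundle):
    print(step)
```

Without the two DCAT properties, the client would have to guess from the file extension or inspect the downloaded bytes before it could reach the CSV content.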

I propose that SOSO add a recommendation to use dcat:accessURL, dcat:downloadURL, dcat:compressFormat and dcat:packageFormat, along with schema:contentUrl and schema:encodingFormat.
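As a rough sketch of what such a recommendation could look like in practice, the Python snippet below builds one possible JSON-LD distribution that combines the schema.org and DCAT properties. The dataset name, the example.org URLs, the function name `build_dataset`, and the choice to express the formats as IANA media-type IRIs are all assumptions for illustration; the actual SOSO pattern would need to be agreed on in this issue.

```python
import json

# Base IRI for IANA media types, used here (as an assumption) for the
# values of dcat:compressFormat and dcat:packageFormat.
IANA = "https://www.iana.org/assignments/media-types/"


def build_dataset() -> dict:
    """Build a hypothetical Dataset with one DataDownload distribution."""
    return {
        "@context": {
            "@vocab": "https://schema.org/",
            "dcat": "http://www.w3.org/ns/dcat#",
        },
        "@type": "Dataset",
        "name": "Example observations",  # placeholder name
        "distribution": {
            "@type": "DataDownload",
            # schema.org properties already in use
            "contentUrl": "https://example.org/data/obs.tar.gz",
            "encodingFormat": "text/csv",
            # proposed DCAT additions
            "dcat:accessURL": {"@id": "https://example.org/datasets/obs"},
            "dcat:downloadURL": {"@id": "https://example.org/data/obs.tar.gz"},
            "dcat:compressFormat": {"@id": IANA + "application/gzip"},
            "dcat:packageFormat": {"@id": IANA + "application/x-tar"},
        },
    }


doc = build_dataset()
print(json.dumps(doc, indent=2))
```

In this sketch contentUrl and dcat:downloadURL carry the same link, so consumers that only understand schema.org still find the file, while DCAT-aware consumers also learn that it is a gzipped TAR bundle.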