Open ddeboer opened 3 years ago
Proposed well-known-URI for datacatalogs (inspired by https://www.w3.org/TR/void/#well-known) for inclusion in Requirements for Datasets:
Discovery with well-known URI
RFC 5785 defines a mechanism for reserving 'well-known' URIs on any Web server.
The URI /.well-known/datacatalog on any Web server is registered by this specification for a datacatalog with dataset descriptions of datasets hosted on that server. For example, on the host www.example.com, this URI would be http://www.example.com/.well-known/datacatalog.
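As a rough illustration of the rule above, the well-known URI can be derived from any URL on the same server by keeping the scheme and host and replacing the path. This is only a sketch; the function name `well_known_datacatalog_uri` and the constant `WELL_KNOWN_PATH` are made up for this example and are not part of the proposal.

```python
from urllib.parse import urlsplit, urlunsplit

# Path proposed in this issue; the suffix is not (yet) registered with IANA.
WELL_KNOWN_PATH = "/.well-known/datacatalog"


def well_known_datacatalog_uri(url: str) -> str:
    """Return the well-known datacatalog URI for the server hosting `url`.

    Hypothetical helper: keeps scheme and host (including any port),
    drops the original path, query and fragment.
    """
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))


if __name__ == "__main__":
    print(well_known_datacatalog_uri("http://www.example.com/some/dataset?page=2"))
```

Called with any dataset URL on www.example.com, this should yield the same URI as the example in the text.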
This URI may be an HTTP redirect to the location of the actual datacatalog file. The most appropriate HTTP redirect code is 302. Clients accessing this well-known URI MUST handle HTTP redirects.
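A client-side sketch of the redirect requirement might look like the following. It assumes a plain HEAD request is enough to discover the final location, and the names `head`, `resolve_catalog_location`, `REDIRECT_CODES` and the limit of five redirects are illustrative choices, not something this proposal prescribes.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch (assumption).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url):
    """Send a HEAD request; return (status, Location header or None)."""
    parts = urlsplit(url)
    conn_cls = (
        http.client.HTTPSConnection
        if parts.scheme == "https"
        else http.client.HTTPConnection
    )
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def resolve_catalog_location(url, fetch=head, max_redirects=5):
    """Follow redirects from the well-known URI to the actual catalog.

    `fetch` is injectable so the logic can be exercised without a network.
    Returns (final status, final URL).
    """
    for _ in range(max_redirects + 1):
        status, location = fetch(url)
        if status in REDIRECT_CODES and location:
            # Location may be relative; resolve it against the current URL.
            url = urljoin(url, location)
            continue
        return status, url
    raise RuntimeError("too many redirects")
```

Keeping `fetch` as a parameter is a design choice for testability; a real client could equally rely on an HTTP library that follows redirects by default.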
The datacatalog file accessible via the well-known URI should contain descriptions of all datasets hosted on the server. This includes any datasets that have resolvable URIs, a SPARQL endpoint, a data dump, or any other access mechanism whose URI is on the server's hostname. Datacatalogs can be described using http://www.w3.org/ns/dcat#Catalog or https://schema.org/DataCatalog.
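To make the shape of such a file more concrete, here is a minimal sketch that builds a schema.org `DataCatalog` as JSON-LD. The helper `build_datacatalog`, the input keys `uri` and `name`, and the example dataset are all hypothetical; a real catalog would carry far more metadata per dataset, and a DCAT serialization would look different.

```python
import json


def build_datacatalog(name, datasets):
    """Build a minimal schema.org DataCatalog as a JSON-LD dict.

    `datasets` is a list of dicts with 'uri' and 'name' keys
    (an input format invented for this example).
    """
    return {
        "@context": "https://schema.org/",
        "@type": "DataCatalog",
        "name": name,
        "dataset": [
            {"@type": "Dataset", "@id": d["uri"], "name": d["name"]}
            for d in datasets
        ],
    }


catalog = build_datacatalog(
    "Example catalog",
    [{"uri": "http://www.example.com/datasets/1", "name": "Example dataset"}],
)
print(json.dumps(catalog, indent=2))
```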
This document defines the “.well-known” URI datacatalog using the registration procedure and template from Section 5.1 of RFC 5785 as follows:
URI suffix: datacatalog
Change controller: W3C
Specification document(s): This document.
Example (for testing):
curl -I https://www.openarch.nl/.well-known/datacatalog
HTTP/2 302
location: https://www.openarch.nl/datasets/
Impact on the Register function (Design):
Please note that the .well-known suffix 'datacatalog' is not a registered suffix, see https://www.iana.org/assignments/well-known-uris/well-known-uris.xhtml. Do we know what consequences using an unregistered suffix might have? Are we breaking standards by using an unregistered suffix?
The next step, if there are no comments on the text, is to include the proposed text in our requirements document so we have a referenceable document. Then we can send a request to have 'datacatalog' included in the list via https://github.com/protocol-registries/well-known-uris
I think that for a formal registration of the 'datacatalog' suffix we should seek broader support in the (DCAT) community, as it makes no sense, and probably has little chance of success, to do this only from the Dutch Digital Heritage perspective. Maybe we could consult Ruben Verborgh, Antoine Isaac or Herbert Van de Sompel, as they have been involved in the Dataset Exchange Working Group (see https://www.w3.org/2020/02/dx-wg-charter.html) in one way or another.
I have posted the issue "Improve discovery of datacatalogs by registering well-known suffix 'datacatalog'" at https://github.com/w3c/dxwg/issues/1290 and https://github.com/schemaorg/schemaorg/issues/2827 to seek the support of these communities.
Based on: