w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

Possible mismatch between the stated scope in the non-normative text and the vocabulary specification #1186

Closed aidig closed 4 years ago

aidig commented 4 years ago

Problem statement:

There seems to be a possible mismatch between the stated goals and scope in the non-normative text and the vocabulary specification presented at https://www.w3.org/TR/vocab-dcat-2/.

In other words, according to the non-normative text, a dcat:Catalog in DCAT 2.0 may include many different kinds of assets, not just datasets and data services (provided the appropriate subclasses are defined, naturally). Yet under the current definition in the vocabulary specification, dcat:Catalog is still limited to datasets and data services.

Examples

Example from non-normative text: (Status of this Document)

The main changes to the DCAT vocabulary have been: (..)

  • loosening of constraints in class and property definitions to promote re-use of terms and modularity
  • addition of a dcat:Resource class for representing any asset that can be included in the catalog, this is now the super-class of dcat:Dataset

Example from non-normative text: (5.1 DCAT scope) https://www.w3.org/TR/vocab-dcat-2/#dcat-scope

dcat:Resource represents a dataset, a data service or any other resource that may be described by a metadata record in a catalog. This class is not intended to be used directly, but is the parent class of dcat:Dataset, dcat:DataService and dcat:Catalog. Member items in a catalog should be members of one of the sub-classes, or of a sub-class of these, or of a sub-class of dcat:Resource defined in a DCAT profile or other DCAT application. dcat:Resource is effectively an extension point for defining a catalog of any kind of resource. dcat:Dataset and dcat:DataService can be used for datasets and services which are not documented in any catalog.

These statements do not seem to be reflected in the following excerpts from the vocabulary specification (the current definition and usage note). Please note that the important change NOTE is formalized in neither the table nor the Turtle representation, which is not ideal.

Example from the description of dcat:Catalog in the vocabulary specification: https://www.w3.org/TR/vocab-dcat-2/#Class:Catalog

NOTE
The scope of DCAT 2014 was catalogs of datasets [VOCAB-DCAT-20140116]. This has been generalized, and properties common to all cataloged resources are now associated with a super-class dcat:Resource.
Moreover, an explicit class for data services has been added in this revision of DCAT, to enable these to be part of a catalog.
Finally, dcat:Catalog has been made a sub-class of dcat:Dataset, and provision for catalogs to be composed of other catalogs is also enabled.
See Issue #116 and Issue #172.
The following properties are specific to this class: catalog record, has part, dataset, service, catalog, homepage, themes.

The following properties are inherited from the super-class dcat:Dataset: distribution, frequency, spatial/geographic coverage, spatial resolution, temporal coverage, temporal resolution, was generated by. The following properties are inherited from the super-class dcat:Resource: access rights, conforms to, contact point, creator, description, has policy, identifier, is referenced by, keyword/tag, landing page, license, catalog language, relation, rights, qualified relation, publisher, release date, theme/category, title, type/genre, update/modification date, qualified attribution.
RDF Class: | dcat:Catalog
-- | --
Definition: | A curated collection of metadata about datasets and data services
Sub-class of: | dcat:Dataset
Usage note: | A Web-based data catalog is typically represented as a single instance of this class.
See also: | § 6.5 Class: Catalog Record, § 6.6 Class: Dataset

Or in ttl:

dcat:Catalog
  a rdfs:Class ;
  a owl:Class ;
  rdfs:comment "A curated collection of metadata about datasets and data services"@en ;
  rdfs:subClassOf dcat:Dataset ;
  rdfs:subClassOf [
      a owl:Restriction ;
      owl:allValuesFrom dcat:Resource ;
      owl:onProperty dct:hasPart ;
    ] ;
  skos:definition "A curated collection of metadata about datasets and data services."@en ;
  skos:scopeNote "A web-based data catalog is typically represented as a single instance of this class."@en ;
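
To make the mismatch concrete, here is a minimal sketch of a catalog that the non-normative text appears to allow but the textual definition does not cover. Everything in the `ex:` namespace (`ex:Vocabulary`, `ex:skosScheme`, `ex:catalog`) is made up for illustration and is not part of DCAT or of any published profile.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/profile#> .   # hypothetical namespace

# Hypothetical profile class: neither a dataset nor a data service.
ex:Vocabulary
  a rdfs:Class ;
  rdfs:subClassOf dcat:Resource .

# Hypothetical cataloged resource of that class.
ex:skosScheme
  a ex:Vocabulary ;
  dct:title "An example concept scheme"@en .

# The member is a dcat:Resource, so the owl:allValuesFrom restriction
# on dct:hasPart is satisfied, yet it is not "metadata about datasets
# and data services" as the definition states.
ex:catalog
  a dcat:Catalog ;
  dct:title "An example catalog"@en ;
  dct:hasPart ex:skosScheme .
```

As far as I can tell, the formal restriction accepts this graph, while a reader relying on the skos:definition alone would conclude that such a catalog is out of scope.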

This is not a trivial issue, as the definition determines how many instances qualify as relevant. In fact, the current definition of dcat:Catalog corresponds to the term DataCatalog, although it may have been the intention from the start to create a more generic concept, i.e. a catalog of resources. In that case, a change of definition would be an acceptable correction; without that history, however, one could argue that we are now presented with a new concept (one that simply broadens the old concept). A vocabulary for a generic catalogue of resources would be a HIGHLY useful asset (!), but in general a better description and formalization of this technically small but semantically huge change would benefit all users and applications.

Proposal

The formal definition and scope notes of the relevant vocabulary elements ought to reflect the intended scope.

aidig commented 4 years ago

Related use cases from the "W3C Editors Draft":

5.8 Scope or type of dataset with a DCAT description [ID8] https://w3c.github.io/dxwg/ucr/#ID8

5.20 Modelling resources different from datasets [ID20] https://w3c.github.io/dxwg/ucr/#ID20

riccardoAlbertoni commented 4 years ago

We have revised the definition of dcat:Catalog to make it more consistent.