Open coret opened 2 years ago
To be clear: this will only work when a catalog is registered, not an individual dataset.
How should we model this relation in DCAT?
1. <catalog> dcat:dataset <dataset>?
2. The inverse of dcat:dataset (in other words, the equivalent of schema:includedInDataCatalog)? dcat:catalog is weird because its domain is dcat:Catalog rather than dcat:Dataset.
Should we also store the catalog itself with any metadata provided by the user?
To be clear: this will only work when a catalog is registered, not an individual dataset.
Correct
How should we model this relation in DCAT?
I'm inclined to say option 2, as the Dataset Register is about dataset (description)s. I think dcat:catalog (BTW: DCAT 2 or DCAT 3?) is equivalent to schema:includedInDataCatalog (which has schema:DataCatalog as its range).
Should we also store the catalog itself with any metadata provided by the user?
The catalog (if provided or linked via schema:includedInDataCatalog in the dataset *) might contain interesting metadata, especially in light of the need for some organisations to provide information about some kind of compound dataset.
Is crawling, validating, storing and querying data catalogs straightforward?
*) I just realized that we could use schema:includedInDataCatalog as a discovery mechanism (in case a dataset is registered and this property is present). But maybe organizations deliberately provide only some datasets, because those are heritage-specific and the others (from the catalog) are not...
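The discovery idea above could be sketched roughly as follows. This is only an illustration, not how the Dataset Register works: triples are modelled as plain (subject, predicate, object) tuples, and the function name discover_catalogs and the example.org URLs are made up for the example.

```python
# Hypothetical sketch: find catalogs that registered datasets point to
# via schema:includedInDataCatalog but that are not registered themselves.
INCLUDED_IN = "https://schema.org/includedInDataCatalog"


def discover_catalogs(dataset_triples, registered_urls):
    """Return a sorted list of candidate catalog URLs.

    dataset_triples: iterable of (subject, predicate, object) string tuples
    registered_urls: URLs already known to the register
    """
    known = set(registered_urls)
    candidates = {
        obj
        for _subject, predicate, obj in dataset_triples
        if predicate == INCLUDED_IN and obj not in known
    }
    return sorted(candidates)


triples = [
    ("https://example.org/d1", INCLUDED_IN, "https://example.org/catalog"),
    ("https://example.org/d2", INCLUDED_IN, "https://example.org/catalog"),
    ("https://example.org/d3", "https://schema.org/name", "Some dataset"),
]
candidates = discover_catalogs(triples, ["https://example.org/d1"])
```

Given the caveat that organizations may register only some datasets on purpose, the result is best treated as a list of candidates to review rather than catalogs to crawl automatically.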
I'm inclined to say option 2, as the Dataset Register is about dataset (description)s. I think dcat:catalog (BTW: DCAT 2 or DCAT 3?) is equivalent to schema:includedInDataCatalog (which has schema:DataCatalog as its range).
In both DCAT 2 and DCAT 3, dcat:catalog has domain dcat:Catalog, so it can only be applied to a dcat:Catalog (which is a subclass of dcat:Dataset). So we’re still looking for the inverse of dcat:dataset, of which the DCAT spec says:
However, recognizing that inverses are needed for some use cases, DCAT supports them, but with the requirement that they MAY be used only in addition to those described in 6. Vocabulary specification, and that they MUST NOT be used to replace them.
It mentions dcat:inCatalog there, which seems to be what we’re looking for, although it should be used only in addition to dcat:dataset.
The Dataset Register handles Datasets and DataCatalogs. When handling a DataCatalog all the descriptions of datasets are "extracted" and stored. The fact that a Dataset was part of a DataCatalog is only stored in the Dataset Register if the (recommended, but not required) schema:includedInDataCatalog property was provided in the dataset description.
In cases where dataset descriptions in a DataCatalog do not provide the (reverse) schema:includedInDataCatalog property, the Dataset Register could add this property to the dataset description, as this can be valuable information.
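A minimal sketch of that last step, assuming the catalog description has already been parsed into (subject, predicate, object) tuples and that the catalog lists its datasets with schema:dataset. The function name add_missing_catalog_links and the example.org URLs are hypothetical; an actual implementation would more likely work on an RDF store.

```python
# Hypothetical sketch: add schema:includedInDataCatalog to dataset
# descriptions that a catalog lists but that do not link back to it.
SCHEMA = "https://schema.org/"
DATASET_PROP = SCHEMA + "dataset"
INCLUDED_IN = SCHEMA + "includedInDataCatalog"


def add_missing_catalog_links(triples):
    """Return the input triples plus any missing reverse links.

    Existing triples are kept unchanged and in order; a reverse link is
    appended only when the dataset description does not already have it.
    """
    existing = set(triples)
    result = list(triples)
    for subject, predicate, obj in triples:
        if predicate != DATASET_PROP:
            continue
        reverse = (obj, INCLUDED_IN, subject)
        if reverse not in existing:
            existing.add(reverse)
            result.append(reverse)
    return result


catalog = "https://example.org/catalog"
triples = [
    (catalog, DATASET_PROP, "https://example.org/d1"),
    (catalog, DATASET_PROP, "https://example.org/d2"),
    ("https://example.org/d2", INCLUDED_IN, catalog),  # already links back
]
enriched = add_missing_catalog_links(triples)
```

Here only d1 gains a reverse link, because d2 already provides one. The forward schema:dataset statements are left in place, so the reverse property is added alongside them rather than replacing them.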