Open H-a-g-L opened 9 months ago
I would add to this that maybe we should not keep dcterms:hasPart on the catalog, instead treat it as being replaced by dcat:catalog. For sure, keeping dcterms:hasPart as well is possible as it is inherited from cataloged resource. But I see no reason for having both in DCAT-AP.
Upon further reflection and in consideration of https://github.com/w3c/dxwg/issues/1454#issuecomment-1054653629 , dcat:catalog cannot be used to replace dct:hasPart in the JRC instance, because the datasets of the "child" catalogues should be inferred as contained also in the "parent" catalogue. At the same time, IMHO, a review of the range of dct:hasPart would be useful.
Ok, thanks for pointing this out, had missed that. So, I guess we need to decide what use-case we are trying to fulfill:
I see a need for 2, not 1. For instance, an organization that maintains two catalogs: Catalog A is maintained manually, and Catalog B is created via a transform from another system.
In this case it makes sense to let catalog A point to catalog B via dcterms:hasPart, and a harvesting mechanism may be allowed to merge the two.
I think a concrete example should indeed be created here to show the distinction: what is the initial situation, which operation happens, and what is the resulting catalogue structure?
@ODP-hil can you create such examples to move this issue forward?
Indeed @matthiaspalmer 's use-case 2 is the one we need to accommodate. To give an example:
<dcat:Catalog rdf:about="https://data.jrc.ec.europa.eu/">
  <dct:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://data.jrc.ec.europa.eu/</dct:identifier>
  <dct:hasPart rdf:resource="http://data.jrc.ec.europa.eu/collection/datam"/>
  <…>
<dcat:Catalog rdf:about="https://data.jrc.ec.europa.eu/collection/datam">
  <dct:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://data.jrc.ec.europa.eu/collection/datam</dct:identifier>
  <dct:isPartOf rdf:resource="https://data.jrc.ec.europa.eu/"/>
  <dcat:dataset rdf:resource="http://data.europa.eu/89h/1ba64b54-246f-4888-8824-080971c46145"/>
  <dcat:dataset rdf:resource="http://data.europa.eu/89h/5a06cad1-6c12-4d17-b008-4b58956ec3d8"/>
  <…>
When data.europa.eu harvests the JRC catalogue, the datasets of sub-catalogues are attributed directly to the parent catalogue. This is supported by the use of the inverse property dct:isPartOf on the dataset:
<dcat:Dataset rdf:about="http://data.europa.eu/89h/1ba64b54-246f-4888-8824-080971c46145">
  <dct:identifier rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://data.europa.eu/89h/1ba64b54-246f-4888-8824-080971c46145</dct:identifier>
  <dct:isPartOf rdf:resource="http://data.jrc.ec.europa.eu/" />
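To make the attribution described above concrete, here is a minimal Python sketch of the "subset" reading of dct:hasPart. It is only an illustration, not how data.europa.eu actually implements harvesting; the function name `datasets_in` and the shortened identifiers are invented for this example.

```python
# Triples as (subject, predicate, object). Identifiers are shortened placeholders,
# not the real JRC URIs.
TRIPLES = {
    ("jrc", "dct:hasPart", "jrc/collection/datam"),
    ("jrc/collection/datam", "dcat:dataset", "89h/1ba64b54"),
    ("jrc/collection/datam", "dcat:dataset", "89h/5a06cad1"),
}


def datasets_in(catalog, triples, _seen=None):
    """Return datasets listed by `catalog` or by any catalogue reachable via dct:hasPart."""
    seen = _seen if _seen is not None else set()
    if catalog in seen:  # guard against dct:hasPart cycles
        return set()
    seen.add(catalog)
    # Datasets the catalogue lists directly.
    found = {o for s, p, o in triples if s == catalog and p == "dcat:dataset"}
    # Datasets inherited from "child" catalogues.
    for s, p, o in triples:
        if s == catalog and p == "dct:hasPart":
            found |= datasets_in(o, triples, seen)
    return found


print(sorted(datasets_in("jrc", TRIPLES)))
```

Under this reading the parent catalogue ends up with both datasets even though it declares no dcat:dataset of its own.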
I think it is the case that dcat:catalog and the usage of dct:hasPart in DCAT-AP for a catalogue coincide.
Can someone explain the difference between cases 1 and 2? I do not see the need to use a different property to distinguish (sub-)catalogues that are in scope of the DCAT-AP aggregation catalogue but not harvestable from harvestable DCAT catalogues. That feels uncomfortable, so can someone help me by explaining the semantic difference between the two properties?
For example, the case below uses dct:hasPart for a harvestable Catalogue and dcat:catalog for a non-harvestable one.
:c1 a dcat:Catalog;
  dct:hasPart :c2;
  dcat:catalog :c3.
:c2 a dcat:Catalog;
  dct:title "I am a harvestable Catalog".
:c3 a dcat:Catalog;
  dct:title "I cannot be harvested".
@matthiaspalmer and @H-a-g-L during the last webinar there was confusion about the semantics or expected behaviour for dcat:catalog versus dct:hasPart (in the context of DCAT-AP).
In DCAT 3, the definition of dcat:catalog is "a catalog that is listed in the catalog". In DCAT-AP 3, the definition of dct:hasPart is "a related Catalogue that is part of the described Catalogue".
So can you clarify what, in your opinion, the difference is between "listed" and "part of"?
@bertvannuffelen for me "listed" is very close to "member of" from a set-theoretical perspective, while "part of" is more open and could be interpreted in many ways. We need to indicate a subset relation, hence "has part" is the only viable option, as it is wide enough to include that interpretation.
I understand there is a difference between a member and a subset in set theory, but the definition is not clear about which one is intended.
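The member/subset distinction can be illustrated with plain Python sets. This is just a toy model with made-up dataset names, treating a catalogue as the set of things it contains.

```python
# A "child" catalogue modelled as the set of its datasets (hypothetical names).
dataset_1, dataset_2 = "dataset-1", "dataset-2"
child = frozenset({dataset_1, dataset_2})

# Reading 1 ("listed"): the child catalogue is a *member* of the parent.
# The parent contains the child as one item, not the child's datasets.
parent_as_list = {child}

# Reading 2 ("part of" as subset): the child's datasets are also the parent's.
parent_as_union = set(child) | {"dataset-3"}

print(child in parent_as_list, dataset_1 in parent_as_list)    # True False
print(child <= parent_as_union, dataset_1 in parent_as_union)  # True True
```

In the first reading a dataset of the child is not a member of the parent; in the second it is.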
So let's analyse the case and try to understand it.
I think you will agree that if a Catalogue is considered a set then the members of a Catalogue are the Catalogued Resources.
So the question is whether the referenced catalogues by the property dcat:catalog must be part of the Catalogued Resource/s for that Catalogue or not.
W3C DCAT does not make a textual statement about it.
However, W3C DCAT states that dcat:catalog is a subproperty of dcat:resource, and dcat:resource is the membership property indicating that a Catalogued Resource is a member of a Catalogue. (We all use dcat:dataset in that meaning.) As a consequence, dcat:catalog is a specialisation of the membership property, and thus the catalogues referenced by dcat:catalog must be members of the Catalogue.
Following the above reasoning, our use of dct:hasPart as a subset declaration (we have aggregated the referenced catalogues into one) is not covered by dcat:catalog.
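The reasoning above can be sketched as a small Python example of sub-property entailment (RDFS rule rdfs7). The sub-property table only encodes the relations mentioned in this thread, and the function name `entail_superproperties` is invented for the example.

```python
# Sub-property relations as stated in this thread (child property -> parent property).
SUBPROPERTY_OF = {
    "dcat:catalog": "dcat:resource",
    "dcat:dataset": "dcat:resource",
    "dcat:service": "dcat:resource",
}


def entail_superproperties(triples):
    """Add (s, q, o) for every (s, p, o) where p is a (transitive) sub-property of q."""
    inferred = set(triples)
    for s, p, o in triples:
        q = SUBPROPERTY_OF.get(p)
        while q is not None:
            inferred.add((s, q, o))
            q = SUBPROPERTY_OF.get(q)
    return inferred


data = {(":c1", "dcat:catalog", ":c3"), (":c1", "dct:hasPart", ":c2")}
closure = entail_superproperties(data)

print((":c1", "dcat:resource", ":c3") in closure)  # True: :c3 is a member of :c1
print((":c1", "dcat:resource", ":c2") in closure)  # False: dct:hasPart implies no membership
```

So a catalogue linked with dcat:catalog is entailed to be a catalogued resource of the parent, while one linked only with dct:hasPart is not.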
proposal
For the domain dcat:Catalog, both properties dct:hasPart ("a related Catalogue that is part of the described Catalogue") and dcat:catalog (sub-property of dct:hasPart - "a catalogue whose contents are of interest in the context of this catalogue") have the range of dcat:Catalog. DCAT 3 revises the definition of dcat:catalog (https://github.com/w3c/dxwg/issues/1156 "A catalog that is listed in the catalog") and introduces the property dcat:resource (sub-property of dct:hasPart) as the parent property of dcat:catalog as well as of dcat:dataset and dcat:service. The new property should only be used when none of the available sub-properties can be.
To align more closely with DCAT 3 and remove the ambiguity, I suggest changing the usage note to indicate more clearly that dcat:catalog should be used to link between "parent" and "child" catalogues. Likewise, the range of dct:hasPart should be changed to dcat:Resource. However, current implementations exist where dct:hasPart is used to link catalogues (cfr. EU Open data portal, JRC data catalogue). In principle, these should not be in violation of the proposed revision, because dcat:catalog is a sub-property of dct:hasPart and dcat:Catalog is a sub-class of dcat:Resource.
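The compatibility claim in the last sentence can be checked with a small Python sketch. It assumes a simplified range check over declared types; the helper names `is_instance_of` and `range_violations` and the `ex:` identifiers are hypothetical, and a real validation would use SHACL or an RDFS reasoner.

```python
# Sub-class relations (child class -> parent class).
SUBCLASS_OF = {"dcat:Catalog": "dcat:Resource", "dcat:Dataset": "dcat:Resource"}


def is_instance_of(declared_type, target):
    """True if declared_type equals target or is a (transitive) sub-class of it."""
    t = declared_type
    while t is not None:
        if t == target:
            return True
        t = SUBCLASS_OF.get(t)
    return False


def range_violations(triples, types, prop, required_range):
    """List objects of `prop` whose declared type does not satisfy `required_range`.

    Objects without a declared type are reported as violations in this sketch.
    """
    return sorted(
        o for s, p, o in triples
        if p == prop and not is_instance_of(types.get(o), required_range)
    )


# Existing practice: a parent catalogue links a child catalogue with dct:hasPart.
triples = {("ex:parent", "dct:hasPart", "ex:child")}
types = {"ex:parent": "dcat:Catalog", "ex:child": "dcat:Catalog"}

print(range_violations(triples, types, "dct:hasPart", "dcat:Resource"))  # []
```

With the range widened to dcat:Resource, the existing catalogue-to-catalogue links still pass the check, because dcat:Catalog is a sub-class of dcat:Resource.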