SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

dcat:Catalog - dct:isPartOf - cardinality 0..n instead of 0..1? #157

Closed jimjyang closed 2 years ago

jimjyang commented 3 years ago

During the recent public consultation on our DCAT-AP-NO, a question came up about the cardinality of dct:isPartOf on dcat:Catalog: shouldn't it be 0..n, since a dcat:Catalog may logically be part of more than one dcat:Catalog?

bertvannuffelen commented 3 years ago

In general this is probably true.

However, we have to take the application scope of DCAT-AP into account. The aim is to create a specification for catalogues for Open Data portals. In that context we have 2 main use cases:

The property dct:isPartOf appears in the second use case. Harvesting obviously creates a catalogue Hcat that is composed of multiple source catalogues Scat_i. In that setting every Scat_i will be part of the single catalogue Hcat. If Hcat is itself harvested into another catalogue, a tree of catalogues appears. As long as the harvesting process follows a tree hierarchy, no Scat_i will appear multiple times in the final catalogue, and thus there is no need to be able to express being part of multiple catalogues.
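To make the tree argument concrete, here is a minimal sketch, not part of DCAT-AP itself. It assumes the dct:isPartOf statements are available as plain (child, parent) pairs and that they contain no cycles; the function names and the catalogue name `Top` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def parents_by_catalog(is_part_of):
    """Group dct:isPartOf statements, given as (child, parent) pairs, by child."""
    parents = defaultdict(set)
    for child, parent in is_part_of:
        parents[child].add(parent)
    return parents


def violates_max_one(is_part_of):
    """Return, sorted, the catalogues with more than one dct:isPartOf parent."""
    parents = parents_by_catalog(is_part_of)
    return sorted(c for c, ps in parents.items() if len(ps) > 1)


def count_paths_to(root, catalog, is_part_of):
    """Count the distinct isPartOf chains from catalog up to root.

    Assumes the statements contain no cycles.
    """
    if catalog == root:
        return 1
    parents = parents_by_catalog(is_part_of)
    return sum(count_paths_to(root, p, is_part_of)
               for p in parents.get(catalog, ()))


# Tree-shaped harvesting: Scat_1 and Scat_2 go into Hcat, Hcat goes into Top.
tree = [("Scat_1", "Hcat"), ("Scat_2", "Hcat"), ("Hcat", "Top")]

# Non-tree harvesting: Top additionally harvests Scat_1 directly.
non_tree = tree + [("Scat_1", "Top")]
```

With `tree`, every catalogue has at most one parent and `Scat_1` reaches `Top` along exactly one chain. With `non_tree`, `Scat_1` has two parents and reaches `Top` along two chains, which is the situation in which its contents would show up more than once in the aggregating catalogue.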

One could argue that this tree hierarchy in harvesting is not always applied, or followed, enabling the case that a catalog could be part of more than one other catalog. The biggest consequence is that this brings additional challenges with the uniqueness of the dataset and dataservice descriptions in the final aggregating catalog.

Limiting the cardinality to 1 forces a tree-shaped catalog structure. That is easy to understand, but unfortunately it does not remove the challenge of the uniqueness of dataset and dataservice descriptions. Very often, open data portal software creates new identifiers for the dataset and dataservice descriptions in its catalog after harvesting another catalog, and only these new identifiers are exchanged with the next catalog. Only if the open data portal software keeps a trace of the original identifiers in the dct:identifier/adms:identifier attributes can one detect identical entries.
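The detection step can be sketched as follows. This is an illustration under stated assumptions, not how any particular portal software works: each harvested description is modelled as a plain dict with a `uri` key (the new identifier assigned by the harvesting portal) and optional `dct:identifier` / `adms:identifier` lists that still carry the original identifier. The function name and all example URIs are hypothetical.

```python
from collections import defaultdict


def group_by_source_identifier(descriptions):
    """Map each preserved identifier to the portal URIs that carry it.

    Only identifiers shared by more than one URI are returned, since those
    are the candidate duplicates.
    """
    groups = defaultdict(set)
    for description in descriptions:
        preserved = set(description.get("dct:identifier", []))
        preserved |= set(description.get("adms:identifier", []))
        for identifier in preserved:
            groups[identifier].add(description["uri"])
    return {identifier: sorted(uris)
            for identifier, uris in groups.items()
            if len(uris) > 1}


# Two portals re-identified the same source dataset but kept a trace of it.
harvested = [
    {"uri": "https://portal-a.example/dataset/1",
     "dct:identifier": ["https://source.example/ds/42"]},
    {"uri": "https://portal-b.example/dataset/9",
     "adms:identifier": ["https://source.example/ds/42"]},
    {"uri": "https://portal-b.example/dataset/10",
     "dct:identifier": ["https://source.example/ds/43"]},
]

duplicates = group_by_source_identifier(harvested)
```

If a portal drops the original identifier when it mints a new one, the description simply never lands in a shared group, so the duplicate goes undetected regardless of whether the cardinality is 0..1 or 0..n.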

So back to your original question: is there a use case where 0..1 blocks the application of DCAT-AP? Or vice versa, what do others think of relaxing the cardinality to 0..n?

Or, going further, do we need to include it in DCAT-AP at all? It is implicitly present in DCAT (it is mentioned in the usage guidelines, but not recorded as an explicit property). In DCAT all the inter-resource relation properties are covered by one usage guideline: use dct when possible; if not, use a qualified relationship. See https://www.w3.org/TR/vocab-dcat-2/#qualified-forms. I believe this guideline is clear, and unless we would like to express stricter requirements in the context of DCAT-AP, I would adopt this approach as a whole.

jimjyang commented 3 years ago

@bertvannuffelen Sorry for my late reply to your question. So far we don't have any practical use case showing that 0..1 blocks the application of DCAT-AP.

jakubklimek commented 2 years ago

I also do not see 0..1 as restrictive at the moment, as we aim at a tree-shaped structure.

bertvannuffelen commented 2 years ago

To the community:

The current proposal is not to change anything in the DCAT-AP specification.

init-dcat-ap-de commented 2 years ago

We currently have no use case for 0..n.

bertvannuffelen commented 2 years ago

During the WG meeting of 21 Oct 2021, the WG decided not to change the specification and thus adopted the proposed resolution.