SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe
Max cardinality for dct:creator #171

Closed andrea-perego closed 2 years ago

andrea-perego commented 3 years ago

Currently, the max cardinality of dct:creator is set to 1 instead of N.

This implies that a dataset or a catalogue can have at most 1 author/creator, which is not the case.
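To make the constraint concrete, here is a minimal sketch of a max-cardinality check over plain (subject, predicate, object) tuples. The example.org IRIs and the helper name `violations` are hypothetical and used only for illustration; in practice such a constraint would more likely be expressed as a SHACL shape than in Python.

```python
from collections import defaultdict

DCT_CREATOR = "http://purl.org/dc/terms/creator"


def violations(triples, predicate, max_count):
    """Return the subjects that have more than max_count distinct values for predicate."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            values[s].add(o)
    return {s: sorted(objs) for s, objs in values.items() if len(objs) > max_count}


# Hypothetical data: one dataset with two creators, one with a single creator.
triples = [
    ("http://example.org/dataset/1", DCT_CREATOR, "http://example.org/agent/alice"),
    ("http://example.org/dataset/1", DCT_CREATOR, "http://example.org/agent/bob"),
    ("http://example.org/dataset/2", DCT_CREATOR, "http://example.org/agent/carol"),
]

# With max cardinality 1, only dataset/1 is reported.
print(violations(triples, DCT_CREATOR, max_count=1))
```

Under a max cardinality of N (no upper bound), the same data would raise no violation at all.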

bertvannuffelen commented 3 years ago

This is a question to the community.

In general I agree that a catalogue or dataset may have multiple authors/creators.

But the definition (section 4.4.3)

This property refers to the entity primarily responsible for producing the dataset

already addresses the question through the word "primarily".

This limitation was set based on the DCAT-AP objective of helping Open Data Portals communicate a dedicated minimal amount of information that helps a potential reuser make a reuse decision. So although a dataset might have multiple creators/authors, one is sufficient for the purposes of an Open Data Portal.

NOTE: This issue also opens the discussion on the max-cardinality requirements for other roles. Should they be revised too?

andrea-perego commented 3 years ago

Thanks, @bertvannuffelen , for explaining the rationale behind the cardinality constraint on dct:creator.

About this point:

This limitation was set based on the DCAT-AP objective of helping Open Data Portals communicate a dedicated minimal amount of information that helps a potential reuser make a reuse decision. So although a dataset might have multiple creators/authors, one is sufficient for the purposes of an Open Data Portal.

Such an assumption depends very much on the actual purpose of having dct:creator in DCAT-AP. E.g., it may relate not only to reuse, but also to acknowledging the parties who contributed to the creation of a resource - which involves provenance, accountability, and IPR aspects.

Actually, one of the reasons why dct:creator was included in DCAT2 was to address one of the requirements of research data - namely, data citation (see the note in §6.4.4).

For this specific use case, restricting the max cardinality of dct:creator to 1 creates a barrier to using DCAT-AP for documenting research data, while a number of initiatives, such as the European Open Science Cloud (EOSC), are working toward this objective. Moreover, (publicly funded) research data fall within the scope of the Open Data Directive.

Besides this, there's another more practical aspect to take into account, i.e., how dct:creator has been (and is being) used in existing DCAT-AP extensions, and whether any max cardinality constraint has been specified.

Looking at the national extensions documented in the 2017 report on Joinup, those using it at the time the report was written (DCAT-AP.de, DCAT-AP_IT, DCAT-AP-NO) have no max cardinality constraint.

Other examples are GeoDCAT-AP, which has supported this property since its first release, and - talking about research data - the DCAT-AP extension (CiteDCAT-AP) used by Zenodo (the most widely used research data repository in Europe, currently including ~2M metadata records).

This situation suggests that allowing the specification of multiple authors meets cross-border and cross-domain requirements.

jakubklimek commented 2 years ago

Looking at the national extensions documented in the 2017 report on Joinup, those using it at the time the report was written (DCAT-AP.de, DCAT-AP_IT, DCAT-AP-NO) have no max cardinality constraint.

Technically speaking, this just means they are all incorrect, as they violate the cardinality constraint of DCAT-AP, which a profile should not do.

But from the interoperability perspective, I see this as an unnecessary restriction and unless there is a clear EU-specific reason to limit this to one, I would remove the restriction, and I would remove them from the other roles as well.
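The conformance point above can be sketched as a small check, assuming (as the comment states) that a profile or extension may only narrow the cardinality range of the specification it builds on. The function name `is_within` and the `(min, max)` tuple encoding, with `None` standing for an unbounded "N", are illustrative choices and not part of DCAT-AP.

```python
def is_within(base, extension):
    """Return True if the extension's (min, max) cardinality stays inside the base's.

    max is None when there is no upper bound ("N").
    """
    base_min, base_max = base
    ext_min, ext_max = extension
    if ext_min < base_min:
        return False  # extension would make a required property optional
    if base_max is None:
        return True   # base has no upper bound, so any upper bound is fine
    # base has an upper bound: the extension must have one too, and not a larger one
    return ext_max is not None and ext_max <= base_max


# Base says 0..1, extension says 0..N: the extension relaxes the constraint.
print(is_within(base=(0, 1), extension=(0, None)))   # False
# Base says 0..N, extension says 0..1: the extension only narrows it.
print(is_within(base=(0, None), extension=(0, 1)))   # True
```

Seen this way, lifting the base constraint to 0..N makes every existing extension conformant, whether it allows one creator or many.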

init-dcat-ap-de commented 2 years ago

We vote for multiple creators.

giorgialodi commented 2 years ago

Looking at the national extensions documented in the 2017 report on Joinup, those using it at the time the report was written (DCAT-AP.de, DCAT-AP_IT, DCAT-AP-NO) have no max cardinality constraint.

Technically speaking, this just means they are all incorrect, as they violate the cardinality constraint of DCAT-AP, which a profile should not do.

But from the interoperability perspective, I see this as an unnecessary restriction and unless there is a clear EU-specific reason to limit this to one, I would remove the restriction, and I would remove them from the other roles as well.

Just a remark on this, since the Italian extension has been mentioned. The Italian extension is still based on the first version of DCAT, and in turn on the first version of DCAT-AP, in which dct:creator was not included. It was Italy that, at that time (2016), decided to include the property in order to meet exactly the objectives that @andrea-perego explained so well above. I really do not see the point of restricting this type of property, since for provenance purposes it is better to acknowledge all possible contributors to the creation of the dataset. The data portal can then show just one according to its own reasons, but from a metadata perspective, documenting all the steps in the creation of the dataset and the different actors involved is valuable. Therefore, although I am no longer in a position to officially vote for my country on this, I would definitely go for multiple creators.

H-a-g-L commented 2 years ago

Although I am in favour of changing the max cardinality for dct:creator of dcat:Dataset, this should also be assessed in light of the range defined by DCAT-AP 2.0.1 as foaf:Agent. For a foaf:Agent creator the 0...1 cardinality makes sense and should be kept. If, however, the range is relaxed to also allow foaf:Person, then the 0...n cardinality would make more sense and would support data citation. For reference, on data.europa.eu some 7000+ datasets have a foaf:Person creator harvested from these catalogues.

init-dcat-ap-de commented 2 years ago

First point: Every foaf:Person is also a foaf:Agent, so a foaf:Person is already allowed. Second point: You are right, with the range of foaf:Agent, multiple creators could already be modeled as a foaf:Group, with every foaf:Person as a foaf:member of the group. I never considered this... Doing it this way would add additional meaning, namely that the creators are some kind of group. One could argue that they are at least the "group of all creators"...

If we go this route, we should

Currently I am not sure if it wouldn't be easier to simply allow multiple creators.
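To compare the two modelling options, here is a rough sketch using plain (subject, predicate, object) tuples. The example.org IRIs and the helper `creators` are hypothetical; the point is only that a consumer has to expand a foaf:Group to recover the individual creators, whereas with multiple dct:creator values it can read them directly.

```python
DCT_CREATOR = "http://purl.org/dc/terms/creator"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_GROUP = "http://xmlns.com/foaf/0.1/Group"
FOAF_MEMBER = "http://xmlns.com/foaf/0.1/member"

DATASET = "http://example.org/dataset/1"
ALICE = "http://example.org/agent/alice"
BOB = "http://example.org/agent/bob"
GROUP = "http://example.org/agent/creators-of-dataset-1"

# Option A: several dct:creator values (needs max cardinality N).
multiple_creators = [
    (DATASET, DCT_CREATOR, ALICE),
    (DATASET, DCT_CREATOR, BOB),
]

# Option B: a single dct:creator that is a foaf:Group (fits max cardinality 1).
group_creator = [
    (DATASET, DCT_CREATOR, GROUP),
    (GROUP, RDF_TYPE, FOAF_GROUP),
    (GROUP, FOAF_MEMBER, ALICE),
    (GROUP, FOAF_MEMBER, BOB),
]


def creators(triples, dataset):
    """Return the creators of dataset, expanding any foaf:Group into its members."""
    groups = {s for s, p, o in triples if p == RDF_TYPE and o == FOAF_GROUP}
    result = set()
    for s, p, o in triples:
        if s == dataset and p == DCT_CREATOR:
            if o in groups:
                # Replace the group by its foaf:member values.
                result.update(m for g, q, m in triples if g == o and q == FOAF_MEMBER)
            else:
                result.add(o)
    return sorted(result)


# Both options yield the same two people, but option B needs the extra expansion step.
print(creators(multiple_creators, DATASET))
print(creators(group_creator, DATASET))
```

Option B carries the same information with a max cardinality of 1, at the cost of an extra node and the implied claim that the creators form a group.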

bertvannuffelen commented 2 years ago

During the WG meeting of 21 Oct 2021, the WG agreed to the proposal to lift the cardinality restriction.