SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

What does conformance to DCAT-AP mean for a data portal? #198

Closed sabinem closed 2 years ago

sabinem commented 3 years ago

Section 6.2 "Receiver requirements" of DCAT-AP 2.0.0 states in this regard: "In order to conform to this Application Profile, an application that receives metadata MUST be able to:

• Process information for all classes specified in section 3.
• Process information for all properties specified in section 4.
• Process information for all controlled vocabularies specified in section 5.2.

As stated in section 3, "processing" means that receivers must accept incoming data and transparently provide these data to applications and services. It does neither imply nor prescribe what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.)."

So in the case of a CKAN portal, does that mean that all classes and properties in DCAT-AP must be both imported and exported by the CKAN data portal if a data publisher provides them?

Currently our Swiss open data portal harvests only datasets and distributions. But if I understand the paragraph above correctly, a data publisher should also be able to use other classes of DCAT-AP, for example a catalog record. While the Swiss data portal may ignore the catalog record class for display purposes, it should still be able to export the class correctly when it is harvested by the European data portal.

This seems a strong requirement that is hard to implement in CKAN. Before I propose something like this, I want to be sure that I have understood it correctly.

Thanks in advance for any help to clarify this.

bertvannuffelen commented 3 years ago

You have to read it as follows. Let's consider the Swiss Data Portal as the application.

a) The Swiss Data Portal conforms to DCAT-AP as a harvester if it accepts and processes all entities described in the application profile. So if a provider offers a DCAT-AP feed that includes a catalog record, this should not raise a system exception in the Swiss Data Portal when it claims full DCAT-AP conformance. It may, though, decide while processing the provided data feed not to import an entity into the aggregated catalog. That is a business decision, e.g. all data services are ignored. In such a case it is best to notify the sources about this decision.

b) The Swiss Data Portal conforms to DCAT-AP as a publisher if it produces a data export in which every entity is described as specified in the application profile. So if the Swiss Data Portal produces a DCAT-AP feed, the content and the usage of the properties and classes should correspond to the semantics in the application profile. As DCAT-AP does not enforce catalog records, the Swiss Data Portal is not obliged to include them in the feed. However, if they are included, they must be included according to the semantics in the application profile. For instance, a modification date should be included for each catalog record.

c) If the Swiss Data Portal is both a DCAT-AP harvester and a publisher, then it should be expected that any dataset, data service and distribution is passed through as is. If the city of Zurich provides a REST API declared as a data service, then this should be a data service on the output, and not a distribution. The latter would alter the semantics.
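Points a) to c) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not how CKAN or any harvester actually works: entities are plain dicts with hypothetical `id`, `type` and `modified` keys, and the set of imported classes stands in for the portal's business decision.

```python
DCAT = "http://www.w3.org/ns/dcat#"

# Classes this illustrative portal chooses to import (a business decision).
IMPORTED_TYPES = {DCAT + "Dataset", DCAT + "Distribution", DCAT + "DataService"}


def harvest(entities):
    """Point a): accept every entity; skip unsupported classes without raising."""
    imported, skipped = [], []
    for entity in entities:
        target = imported if entity.get("type") in IMPORTED_TYPES else skipped
        # Point c): copy the entity unchanged, so a data service stays one.
        target.append(dict(entity))
    return imported, skipped


def export_problems(entities):
    """Point b): catalog records are optional, but need a modification date."""
    return [
        f"{e['id']}: catalog record without modification date"
        for e in entities
        if e.get("type") == DCAT + "CatalogRecord" and not e.get("modified")
    ]


# Hypothetical feed: the "modified" key is an assumption of this sketch.
feed = [
    {"id": "ds-1", "type": DCAT + "Dataset"},
    {"id": "api-1", "type": DCAT + "DataService"},
    {"id": "rec-1", "type": DCAT + "CatalogRecord"},
]
imported, skipped = harvest(feed)
problems = export_problems(feed)
```

The `skipped` list is what the portal could report back to its sources, and `problems` is the kind of check a publisher might run before exposing its own feed.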

Independent of this, there is an internal coherency issue for a CKAN portal w.r.t. DCAT-AP. As you probably know, CKAN has a different terminology (https://docs.ckan.org/en/2.9/user-guide.html#datasets-and-resources) than DCAT. That means that importing and exporting DCAT-AP data is a mapping challenge. It is not solely a business mapping: developers also have to be aware of it, because the CKAN API uses CKAN terminology and not DCAT terminology. For instance, I do not know how one makes a distinction between a data service and a distribution in CKAN, but I expect that this cannot be realized without additional effort. This challenge is not new, but from the DCAT-AP specification perspective, it is for the CKAN DCAT-AP community to address this mapping challenge in such a way that the specification guidelines can be met.
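One conceivable shape for such a mapping is sketched below in Python. It assumes a hypothetical custom field `dcat_type` on the CKAN resource, which is not part of core CKAN, and it is not the approach of any existing CKAN extension. Only `dcat:accessURL` and `dcat:endpointURL` are taken from DCAT itself.

```python
DCAT = "http://www.w3.org/ns/dcat#"


def resource_to_dcat(resource):
    """Map a CKAN-style resource dict to a minimal DCAT description."""
    # "dcat_type" is a hypothetical custom field, not part of core CKAN.
    if resource.get("dcat_type") == "data_service":
        return {"type": DCAT + "DataService", "endpointURL": resource["url"]}
    # Default: treat the resource as a distribution.
    return {"type": DCAT + "Distribution", "accessURL": resource["url"]}


# Illustrative resources with made-up URLs.
api = resource_to_dcat({"url": "https://example.org/api", "dcat_type": "data_service"})
csv_file = resource_to_dcat({"url": "https://example.org/data.csv"})
```

Without some explicit marker of this kind, every resource would come out as a distribution, which is the loss of semantics described in point c).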

sabinem commented 3 years ago

@bertvannuffelen Thank you very much for taking so much time to explain this to me so thoroughly. The examples especially helped me to get this clearer in my head. Your answer confirms what I was worried about. And yes, implementing this with CKAN really does seem to be a challenge. But I also agree that the software needs to adapt to DCAT-AP and not the other way around.