International-Data-Spaces-Association / InformationModel

The Information Model of the International Data Spaces implements the IDS reference architecture as an extensible, machine readable and technology independent data model.
Apache License 2.0

Achieve compatibility with DCAT 2 #109

Open clange opened 4 years ago

clange commented 4 years ago

The W3C Data Exchange Working Group (https://www.w3.org/2017/dxwg/) is working on DCAT 2. DCAT 2 made it to the Candidate Recommendation stage on 3 Oct 2019 (https://www.w3.org/TR/vocab-dcat-2/). Considering the

we should make sure that our information model is compatible with DCAT 2.

Some relevant features of DCAT 2 include:

JohannesLipp commented 4 years ago

Additions: We need to verify that...

JohannesLipp commented 4 years ago

My first investigation is on where we currently use the dcat prefix. This is in 19 files (open tasks marked in bold):

Please note that for comparison, I refer to the DCAT-1 and DCAT-2 Turtle files [1,2] as well as their documentations [3,4].

[1] https://www.w3.org/ns/dcat2014.ttl [2] https://www.w3.org/ns/dcat2.ttl [3] https://www.w3.org/TR/2014/REC-vocab-dcat-20140116/ [4] https://www.w3.org/TR/vocab-dcat-2/

JohannesLipp commented 4 years ago

Detailed investigations and explanations:

JohannesLipp commented 4 years ago

Coming to the major changes from DCAT-1.1 to DCAT-2, which are of interest for us:

JohannesLipp commented 4 years ago

The work for DCAT-2 is done. Compatibility with DCAT-AP is done in issue #277

clange commented 3 years ago

@JohannesLipp I am currently reviewing your investigations. Re ids:VocabularyData, I would suggest that we open a separate issue for getting rid of that class (could you please do it if it's not yet done?). It was a workaround for adding some of the domain-specific structure/semantics features at a time when CodeGen was not yet able to handle terms from non-IDS namespaces.

clange commented 3 years ago

After reviewing, the following questions remain to be asked to DCAT experts.

clange commented 3 years ago

Re. dcat:mediaType: I think we should make this decision in the context of our ongoing discussion on how to replace our media type code lists with something more lightweight that accepts any standard or non-standard string of the form "type/subtype". @JohannesLipp could you please link to that issue from here, or at least make sure we have an issue for that discussion? (The discussion would be similar to #296.) My input is that we should not represent media types simply as string literals but continue to represent them as instances of ids:MediaType, while making sure that additional types can be used easily. The most lightweight representation would be ex:MyDataResource ids:mediaType [ rdfs:label "foo/bar" ], such that the blank node would implicitly be of type ids:MediaType and thus also of dct:MediaType. It does make sense to remain compatible with dct:MediaType, and the good thing is that its specification is so vague that it only requires us to model media types as resources.
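
A minimal Turtle sketch of the lightweight representation described above. The ids: namespace IRI, the ex: names, and the rdfs:range axiom are assumptions made for illustration; the subclass and subproperty relations are the ones stated elsewhere in this thread.

```turtle
@prefix ids:  <https://w3id.org/idsa/core/> .   # assumed namespace IRI
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/> .            # hypothetical example namespace

# Schema side: relations as described in this thread.
ids:mediaType rdfs:subPropertyOf dcat:mediaType ;
    rdfs:range ids:MediaType .                   # assumed range axiom
ids:MediaType rdfs:subClassOf dct:MediaType .

# Instance side: a non-standard media type given as a labelled blank node.
ex:MyDataResource ids:mediaType [ rdfs:label "foo/bar" ] .
```

Under RDFS entailment, the range axiom would type the blank node as ids:MediaType, and the subclass axiom would then also type it as dct:MediaType, without either type being stated explicitly.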

clange commented 3 years ago

@JohannesLipp in https://github.com/International-Data-Spaces-Association/InformationModel/pull/270 (How do you easily/directly link to a pull request?) I did not see anything about the first bullet point in the comment about dct:spatial. Did you also cover that?

clange commented 3 years ago

This comment is a placeholder for some more DCAT2 features I'd like to request to be supported by the IDS infomodel. At the very least we should go through the full list of changes from DCAT 1 to 2 once more. I think at least dcat:DataService is related to ids:Endpoint in a way that we have not yet considered here (see https://www.w3.org/TR/vocab-dcat-2/#Class:Data_Service), and there may be further terms.

JohannesLipp commented 3 years ago

After reviewing, the following questions remain to be asked to DCAT experts.

  • Is it OK to have ids:theme rdfs:subPropertyOf dcat:theme; rdfs:domain [ rdfs:subClassOf dcat:Dataset ]? I think it is, because we are talking about a more specific property, and that property is not mandatory.

I would say yes. Currently, the domain is ids:DigitalContent, which is a subclass of dcat:Dataset. Your suggestion using a blank node would therefore replace the domain ids:DigitalContent with the more general "anything extending dcat:Dataset".
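
For comparison, the two domain declarations can be written out in Turtle. This is only a sketch: the ids: namespace IRI is an assumption, and the second variant is commented out so that the two declarations are not combined.

```turtle
@prefix ids:  <https://w3id.org/idsa/core/> .   # assumed namespace IRI
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Common to both variants, as stated in this thread.
ids:DigitalContent rdfs:subClassOf dcat:Dataset .
ids:theme rdfs:subPropertyOf dcat:theme .

# Variant A (current state): a named class as the domain.
ids:theme rdfs:domain ids:DigitalContent .

# Variant B (the suggestion): an anonymous subclass of dcat:Dataset as the domain.
# ids:theme rdfs:domain [ rdfs:subClassOf dcat:Dataset ] .
```

In both variants, any subject of ids:theme is entailed to be a dcat:Dataset, which is what compatibility with the domain of dcat:theme requires.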

JohannesLipp commented 3 years ago

Re. dcat:mediaType I think we shall take the right decision in the context of our ongoing discussion on how to replace our media type code lists by something more lightweight that can take any standard or non-standard string of the form "type/subtype". @JohannesLipp could you please link to that issue from here, or in any case make sure we have an issue for that discussion? (The discussion would be similar to #296.) My input to that discussion is that I think we should not represent media types simply as string literals but indeed continue to represent them as instances of ids:MediaType, but make sure that additional types can be used easily: the most lightweight representation would be ex:MyDataResource ids:mediaType [ rdfs:label "foo/bar" ], such that the blank node would implicitly be of type ids:MediaType and thus also of dct:MediaType. It does make sense to remain compatible with dct:MediaType, and the good thing is that its specification is so vague that it doesn't restrict us to anything other than modelling media types as resources.

IMHO there is no action needed from our side. dcat:mediaType has range dct:MediaType, and ids:mediaType and ids:MediaType extend these, respectively. We discussed this in #224, and the compact result (following DCAT 2) is the following:

:Foo ids:mediaType <http://www.iana.org/assignments/media-types/text/csv> ;

JohannesLipp commented 3 years ago

@JohannesLipp in #270 (How do you easily/directly link to a pull request?) I did not see anything about the first bullet point in the comment about dct:spatial. Did you also cover that?

You just did that direct link to a pull request in that comment 😃 Thank you for the info, I have not covered that indeed. I solved it via the most recent commit, which we agreed on in today's Infomodel call.

clange commented 3 years ago

@JohannesLipp in investigating the reuse of the IDS infomodel for the Agricultural Information Model of https://h2020-demeter.eu/, where DCAT was given as the baseline, I identified the following missing points:

JohannesLipp commented 3 years ago

ids:spatialCoverage extends dct:spatial. However, we do not use this particular resolution in meters yet.
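
For illustration, a minimal Turtle sketch of the relation stated above. It assumes that "resolution in meters" refers to the DCAT 2 property dcat:spatialResolutionInMeters; the ids: namespace IRI, the ex: names, and the value 30.0 are likewise assumptions.

```turtle
@prefix ids:  <https://w3id.org/idsa/core/> .   # assumed namespace IRI
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .            # hypothetical example namespace

# The IDS property specialises the Dublin Core one.
ids:spatialCoverage rdfs:subPropertyOf dct:spatial .

# Hypothetical resource that states both a coverage and a resolution.
ex:FieldMeasurements ids:spatialCoverage ex:SomeRegion ;
    dcat:spatialResolutionInMeters "30.0"^^xsd:decimal .
```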

  • Also I think my earlier comment on thinking about the relation between dcat:DataService and ids:Endpoint got lost.

lcomet commented 7 months ago

Related to #593