SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

Data theme vocabulary: mapping ISO 19115 to DCAT #314

Open skornsekj opened 11 months ago

skornsekj commented 11 months ago

Problem description:
When mapping ISO 19115 metadata (MD_TopicCategoryCode, ISO 19139:2007) to the DCAT-AP Dataset Theme Vocabulary, you will see that a good mapping is not always possible. In those situations you have to pick a category that is not correct. In doing so you introduce incorrect metadata, even if the themes in the original ISO 19115 metadata are correct.

Some (non-exhaustive) examples: the ISO categories 'elevation' and 'imageryBaseMapsEarthCover' are difficult to map to the Dataset Theme Vocabulary.

Possible solutions:

Question: Is it possible to implement one of the suggested solutions?

bertvannuffelen commented 11 months ago

@skornsekj,

This is one of the many harmonisation issues that always arise with categories or themes.

The DCAT-AP preferred way to address this is the following:

a) DCAT-AP: dcat:theme is restricted to the Dataset Theme Vocabulary

b) for any other theming/category, create a subproperty of dcat:theme or dct:subject with the appropriate codelist

This approach will

And yes, this implicitly promotes the Dataset Theme Vocabulary as the prime theme list. But since it has been designed to cover the whole range of public administration domains (and is thus very broad), it is well suited for that.

Also, such a mapping between codelists is partially an arbitrary decision and is often not "perfect". Therefore it should be in the hands of those who need to act upon the mapping. As the DCAT-AP community we cannot decide which Dataset theme the ISO category "elevation" should have. We can mandate that one should use a mapping, but maintaining a mapping without domain-specific knowledge is hard and unfeasible. Thus, if the INSPIRE community wants to come up with a managed, agreed mapping, then this could be included in the specification by reference.

Maybe I am oversimplifying, but I believe there is no need to make this mapping beyond "supporting the editors of INSPIRE metadata by providing a suggested value", and sharing that insight with the community. As such, the portal can benefit from the slightly different perspectives to support discoverability.
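To illustrate the "suggested value" idea, here is a minimal Python sketch of a lookup that an editing tool might use. The table entries are illustrative assumptions only, not an agreed or official mapping, and the names `SUGGESTED_THEME` and `suggest_data_theme` are hypothetical. Categories without an entry return no suggestion, so the editor decides rather than having an arbitrary theme filled in.

```python
from typing import Optional

# Illustrative entries only (assumption): ISO 19115 topic category -> Data Theme code.
# This is NOT an agreed mapping; a real one would be maintained by the domain community.
SUGGESTED_THEME = {
    "farming": "AGRI",
    "health": "HEAL",
    "transportation": "TRAN",
}


def suggest_data_theme(iso_topic: str) -> Optional[str]:
    """Return a suggested Data Theme code, or None when no suggestion exists."""
    return SUGGESTED_THEME.get(iso_topic)


for topic in ("health", "elevation"):
    suggestion = suggest_data_theme(topic)
    if suggestion is None:
        print(f"{topic}: no suggestion, leave the choice to the metadata editor")
    else:
        print(f"{topic}: suggested Data Theme {suggestion}")
```

Returning `None` instead of a fallback theme keeps the suggestion advisory, which matches the point that the mapping decision belongs to those who act on it.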

jakubklimek commented 11 months ago

I agree with @bertvannuffelen .

Just one minor comment: given that dcat:theme uses the Data theme vocabulary, I find it problematic to suggest that other vocabularies could be used with subproperties of dcat:theme in the RDFS sense.

If I create a subproperty of dcat:theme, I expect it to share all restrictions of dcat:theme, including the mandatory vocabulary, which I view as a kind of range definition, even though this is not expressed in RDFS.

I would therefore suggest using dct:subject or its subproperties, where DCAT-AP has no similar restrictions.
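A minimal sketch of what this could look like, written in Python that emits Turtle. The dataset URI is hypothetical, and the use of the INSPIRE registry URIs for the ISO topic categories is an assumption made for illustration; the thread does not prescribe a particular codelist URI.

```python
# Base URIs. DATA_THEME is the EU Data Theme authority table; ISO_TOPIC is an
# assumed choice of URI set for ISO 19115 topic categories (INSPIRE registry).
DATA_THEME = "http://publications.europa.eu/resource/authority/data-theme/"
ISO_TOPIC = "http://inspire.ec.europa.eu/metadata-codelist/TopicCategory/"


def dataset_turtle(dataset_uri, data_themes, iso_topics):
    """Build a Turtle description: Data Themes go in dcat:theme, ISO topics in dct:subject."""
    lines = [
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a dcat:Dataset",
    ]
    for code in data_themes:
        lines.append(f"    ; dcat:theme <{DATA_THEME}{code}>")
    for code in iso_topics:
        lines.append(f"    ; dct:subject <{ISO_TOPIC}{code}>")
    lines.append("    .")
    return "\n".join(lines)


# Hypothetical dataset; no Data Theme is asserted here, only the original ISO category.
turtle = dataset_turtle("https://example.org/dataset/dem-10m", [], ["elevation"])
print(turtle)
```

The original ISO category is preserved under dct:subject, while dcat:theme stays reserved for values from the Data theme vocabulary.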

idevisser commented 11 months ago

The problem is that the value list takes a policy approach. Much geodata cannot be classified into a policy domain because it concerns a specific physical occurrence: topographical maps, aerial photographs of height and depth, etc. I am not in favor of polluting policy themes with data where an arbitrary theme is given because it needs to be filled in. This ultimately produces unusable information for the user. This situation can be prevented by adding an "other" value or by not making the value list mandatory. The latter may be preferable, because it can also solve other problems.

The comment about the restrictions on subproperties raises another question for me: how can I include multiple themes that could come from different vocabularies?

jakubklimek commented 7 months ago

Just to make the issue description complete: there is a mapping between the ISO 19115 topic categories and the INSPIRE spatial data themes and, from there, a mapping between the INSPIRE spatial data themes and the MDR data themes. Both were developed as part of the work on GeoDCAT-AP and are available at https://github.com/SEMICeu/iso-19139-to-dcat-ap/tree/master/alignments .

This, however, does not resolve the issue that the mapping is not exact.