How should datasets be denoted as high-value datasets in their metadata? #3

Open hogredan opened 3 months ago

hogredan commented 3 months ago

According to the HVD Regulation, public sector bodies holding high-value datasets listed in the Annex shall ensure that the datasets are denoted as high-value datasets in their metadata description (Art. 3(5)).

In Germany, it has been decided that for all datasets in the national spatial data infrastructure that fall under the HVD Regulation, the category must be indicated in the ISO metadata as a keyword in combination with a source reference. This is to enable the central process of transforming ISO metadata into DCAT-AP metadata (permanent delivery towards the national Open Data Portal) and fulfil the requirements of DCAT-AP-HVD.

There are currently two options to declare the category in the metadata, either in free text (gco:CharacterString) or as a reference (gmx:Anchor). Both options can be processed by the above-mentioned transformation.

Example of category declaration in free text (gco:CharacterString)

<gco:CharacterString>High-value dataset categories</gco:CharacterString>
<gmd:CI_DateTypeCode codeList="" codeListValue="publication"/>

Example of the category declaration as a reference (gmx:Anchor)

<gmx:Anchor xlink:href="">Georaum</gmx:Anchor>
<gmx:Anchor xlink:href="">High-value dataset categories</gmx:Anchor>
<gmd:CI_DateTypeCode codeList="" codeListValue="publication"/>
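To illustrate how a transformation could accept both encodings, here is a minimal Python sketch (not the actual German transformation). It assumes the keyword sits in a standard ISO 19139 `gmd:MD_Keywords` block and that the thesaurus is recognised by its title; the function names `text_of` and `hvd_keywords` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 / XLink namespace URIs.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
    "xlink": "http://www.w3.org/1999/xlink",
}


def text_of(parent):
    """Return (text, href) from a gmx:Anchor or gco:CharacterString child.

    href is None for gco:CharacterString, or when the Anchor has no xlink:href.
    """
    if parent is None:
        return None, None
    for tag in ("gmx:Anchor", "gco:CharacterString"):
        node = parent.find(tag, NS)
        if node is not None:
            href = node.get("{%s}href" % NS["xlink"])
            return (node.text or "").strip(), href
    return None, None


def hvd_keywords(root, thesaurus_title="High-value dataset categories"):
    """Collect (label, href) pairs from keyword blocks citing the HVD thesaurus."""
    results = []
    for block in root.iter("{%s}MD_Keywords" % NS["gmd"]):
        title, _ = text_of(
            block.find("gmd:thesaurusName/gmd:CI_Citation/gmd:title", NS)
        )
        if title != thesaurus_title:
            continue  # keyword block belongs to another thesaurus
        for keyword in block.findall("gmd:keyword", NS):
            results.append(text_of(keyword))
    return results
```

Calling `hvd_keywords(ET.parse("record.xml").getroot())` on a record (the file name is only an example) returns the category labels, together with the Anchor URI where one is present.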

How do the other Member States implement the denotation of high-value datasets in their ISO metadata?

laers commented 1 month ago

I have made a similar proposal for tagging metadata for datasets that are in scope of HVD, but with an additional tag. One tag indicates that the dataset is in scope of HVD, using an Anchor to the legislation. A second tag specifies which HVD category the dataset belongs to.

Tag saying that the data set is in scope of HVD (gmx:Anchor xlink:href=""):

INSPIRE Høj-værdi datasæt ... EU legislation ...

Another tag that specifies that the data set belongs to the Earth observation and environment (da: Jordobservation og miljø) category (gmx:Anchor xlink:href=""):

Jordobservation og miljø ... High-value dataset categories ...
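As a rough sketch of the two-tag idea, the Python below builds two `gmd:descriptiveKeywords` elements with Anchors for both the keyword and the thesaurus title. The `example.org` URIs are placeholders (the real hrefs are not shown in this thread), the helper names are hypothetical, and the citation date that a complete `gmd:CI_Citation` needs is left out for brevity.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GMX = "http://www.isotc211.org/2005/gmx"
XLINK = "http://www.w3.org/1999/xlink"

# Use the conventional prefixes when the elements are serialised.
for prefix, uri in (("gmd", GMD), ("gmx", GMX), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)


def anchor(parent, tag, label, href):
    """Append <gmd:{tag}> containing a gmx:Anchor with the given label and href."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    node = ET.SubElement(wrapper, f"{{{GMX}}}Anchor", {f"{{{XLINK}}}href": href})
    node.text = label
    return wrapper


def keyword_block(label, label_href, thesaurus, thesaurus_href):
    """Build one gmd:descriptiveKeywords element (citation date omitted)."""
    outer = ET.Element(f"{{{GMD}}}descriptiveKeywords")
    keywords = ET.SubElement(outer, f"{{{GMD}}}MD_Keywords")
    anchor(keywords, "keyword", label, label_href)
    thesaurus_name = ET.SubElement(keywords, f"{{{GMD}}}thesaurusName")
    citation = ET.SubElement(thesaurus_name, f"{{{GMD}}}CI_Citation")
    anchor(citation, "title", thesaurus, thesaurus_href)
    return outer


# Tag 1: the dataset is in scope of HVD (placeholder URIs, not real ones).
scope_tag = keyword_block(
    "Høj-værdi datasæt", "https://example.org/legislation/hvd",
    "EU legislation", "https://example.org/legislation",
)
# Tag 2: the HVD category the dataset belongs to (placeholder URIs).
category_tag = keyword_block(
    "Jordobservation og miljø", "https://example.org/hvd-category/earth-observation",
    "High-value dataset categories", "https://example.org/hvd-category",
)
```

`ET.tostring(scope_tag, encoding="unicode")` gives the XML fragment to paste into a record.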

Br Lars

hallinpihlatie commented 1 month ago

We're also planning to use Anchors to refer to the HVD categories.

I like Lars's approach for the legislation, but I'd prefer to use a European code list. Could you share yours as a start, Lars?

laers commented 1 month ago

Sure. I generated it in GeoNetwork and you are hopefully able to import it.

    <rdf:Description rdf:about="">
        <skos:prefLabel xml:lang="en">High-value datasets</skos:prefLabel>
        <skos:scopeNote xml:lang="en">COMMISSION IMPLEMENTING REGULATION (EU) 2023/138 of 21 December 2022 laying down a list of specific high-value datasets and the arrangements for their publication and re-use</skos:scopeNote>
        <skos:prefLabel xml:lang="da">Høj-værdi datasæt</skos:prefLabel>
        <skos:scopeNote xml:lang="da">KOMMISSIONENS GENNEMFØRELSESFORORDNING (EU) 2023/138 af 21. december 2022 om en liste over særlige typer datasæt af høj værdi og ordningerne for deres offentliggørelse og videreanvendelse</skos:scopeNote>
        <skos:inScheme rdf:resource=""/>
    </rdf:Description>
    <rdf:Description rdf:about="">
        <rdf:type rdf:resource=""/>
        <skos:prefLabel xml:lang="en">INSPIRE</skos:prefLabel>
        <skos:scopeNote xml:lang="en">DIRECTIVE 2007/2/EC OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE)</skos:scopeNote>
        <skos:prefLabel xml:lang="da">INSPIRE</skos:prefLabel>
        <skos:scopeNote xml:lang="da">EUROPA-PARLAMENTETS OG RÅDETS DIREKTIV 2007/2/EF af 14. marts 2007 om opbygning af en infrastruktur for geografisk information i Det Europæiske Fællesskab (Inspire)</skos:scopeNote>
        <skos:inScheme rdf:resource=""/>
    </rdf:Description>
    <rdf:Description rdf:about="">
        <rdf:type rdf:resource=""/>
    </rdf:Description>
    <rdf:Description rdf:about="">
        <rdf:type rdf:resource=""/>
        <dc:title xml:lang="en">EU legislation</dc:title>
        <dc:title xml:lang="da">EU legislation</dc:title>
        <dc:description xml:lang="en">Code list of EU Directives that a dataset is in scope of.</dc:description>
        <dc:description xml:lang="da">Kodeliste med EU direktiver som et dataprodukt er omfattet af.</dc:description>
    </rdf:Description>

hallinpihlatie commented 1 month ago

Many thanks. I added Finnish and Swedish translations and was able to import it to GN after adding a few lines, mainly namespaces. Here's the RDF-file as a ZIP-file for possible re-use.

hallinpihlatie commented 1 month ago

Back to Germany's question. I'm in favour of the Anchor version and it works fine for metadata in a single language. In multilingual metadata I get a mix of Anchor and CharacterString as shown below. Any hints on how to get rid of this mix in GeoNetwork 3.12?

Paikkatiedot Paikkatiedot Geospatiala data Geospatial High-value dataset categories High-value dataset categories 2023-09-27

hallinpihlatie commented 1 week ago

I just noticed that the High-value dataset categories code list has been updated with the sub-categories in English. What is the recommendation? Should metadata be marked with 1) only the sub-category, 2) only the main category, or 3) both, or 4) is there no recommendation?

oberseri commented 6 days ago

Is there any obligation to indicate the category of HVD datasets? I cannot find one in the legal text. We are thinking of labelling them as HVD only, preferably by referencing a label "HighValueDataset" in our national registry.