Open PeterParslow opened 3 years ago
The W3C mapping, on which this is largely based, is at https://www.w3.org/2015/spatial/wiki/ISO_19115_-_DCAT_-_Schema.org_mapping
Andrea Perego’s ISO 19139 - DCAT mapping in GitHub (James’ link) provides more detail, e.g. the range of each element, and also maps some things outside the DCAT namespace(s).
https://github.com/GeoCat/iso-19139-to-dcat-ap/blob/master/documentation/Mappings.md
(Thanks to James Reid)
Just been contacted by the CDDO data standards team looking to state how to describe "where" in DCAT metadata to be used in the UK government data marketplace. This will include updating the mapping above for DCAT v3.
See https://github.com/co-cddo/data-catalogue-schemas/issues/1
Should we also adapt the GEMINI mapping to DCAT 3 as this now includes better description of dataset series?
That will be a necessary part of the CDDO work; I'll make sure it is available as an update to this GEMINI change request. It's also being discussed (& likely to happen) in the OGC GeoDCAT SWG.
Need to annotate this to show how it aligns (or not!) with the UK Cross-Government Metadata Exchange Model which may be re-branded as a UK Application Profile of DCAT
@PeterParslow to update table, then @archaeogeek to update elements with equivalent mappings, also publish this table as guidance
We'll also need to include guidance or at least comment on converting GEMINI to DCAT covering how many dcat distributions to create (depending on e.g. GEMINI Use constraints & Resource locators).
Revised table, with extra columns for DCAT v3 & UK government metadata exchange model. Note, the UK Gov work is supposed to consider adding spatial & some other things; they also plan to convert it to a full AP of DCAT v3.
GEMINI element | Condition | Schema.org | DCAT/DCAT2[1] | Notes | DCAT3 | UK Gov MXM |
---|---|---|---|---|---|---|
Title | | name | dct:title | | Y | Y |
Dataset language | | inLanguage | dct:language | | Y | N |
Abstract | | description | dct:description | | Y | Y |
Topic category | | keywords | dct:subject | | Y | N |
Keyword | INSPIRE theme | keywords | dcat:theme / dct:subject | DCAT3 expects theme to be used when the target is a SKOS concept; subject in the more general case, whether or not the term is from a controlled vocab | Y | dcat:theme |
Keyword | free text | keywords | dcat:keyword | Schema.org puts all the ‘free text’ keywords in one value; DCAT / MXM keyword are 'uncontrolled' literals | Y | Y |
Keyword | Controlled list, URL | Keywords.DefinedTerm.name; use .description for the textual content of the Anchor or CodeList; use .url for the target of the Anchor | dcat:keyword.DefinedTerm | dcat:theme? | dcat:theme | |
Temporal extent | | temporalCoverage[2] | dct:temporal | | Y | N proposed |
Dataset reference date | 19115 dateType = publication | datePublished | dct:issued release date / issued | | Y | Y |
Dataset reference date | 19115 dateType = revised | dateModified | update date / dct:modified | | Y | Y |
Lineage | | | dct:provenance | | Uses PROV | N |
Extent | | spatialCoverage.Place.name | dct:spatial if available as a link | | Y | N proposed |
Resource locator.linkage | 19115 function = download | contentURL (inside “distribution”) | dcat:downloadURL | | Y | Y |
Resource locator.linkage | 19115 function = “information”, where the page links on to download | | dcat:accessURL of a dcat:Distribution? | | Y | N |
Resource locator.linkage | 19115 function = “information” | url | dcat:landingPage | | Y | N |
Data format | | encodingFormat | dct:format of a dcat:Distribution | Possibly also dcat:mediaType | Y | N |
Responsible organisation | 19115 role = publisher | publisher.Organization (with at least name, email, url) | dct:publisher | | Y | Y |
Responsible organisation | 19115 role = pointOfContact | contactPoint (probably Organisation, with at least name, email, url) | dcat:contactPoint | dcat:contactPoint is a vCard | Y | must contain email & contactName (organisation) |
Use constraints | Use constraints is being used to indicate a licence | license | dct:license | license is a property of a distribution | Y | Y licence |
Use constraints | Where GEMINI has an Anchor URL to the licence | licence.CreativeWork: .abstract (with the free text) and .url (with the Anchor target URL) | | | Y | Y |
Use constraints | Other circumstances | | dct:accessRights | accessRights is a property of the dataset | Y | Y |
Bounding box | | spatialCoverage.geo.GeoShape.box | dct:spatial.dct:Location.dct:bbox | Note: needs translating from four edges to two corners | Y | N |
Resource identifier | | identifier | dct:identifier | | Y | Y |
Resource type | | | rdf:type | cataloguedResource is either Dataset or DataService; Note: DCAT-AP does not distinguish between datasets and dataset series; DCATv3 does | The CataloguedResource can be either Dataset, DatasetSeries, or DataService | Y |
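
The Bounding box row notes that the value needs translating from four edges to two corners. A minimal sketch of that translation for the Schema.org `GeoShape.box` form is given here. It assumes the four GEMINI edges are available as decimal degrees, and that the box is written as the lower corner followed by the upper corner, each as "latitude longitude", which is how Schema.org describes `box`. The function and argument names are illustrative only and are not taken from any existing implementation.

```python
def gemini_bbox_to_geoshape_box(west, east, south, north):
    """Turn four bounding edges into a two-corner box string.

    Assumption: output order is "south west north east", i.e. the
    lower corner then the upper corner, each as "latitude longitude".
    """
    if south > north:
        raise ValueError("south latitude must not exceed north latitude")
    # west may legitimately exceed east for boxes crossing the antimeridian,
    # so longitudes are passed through unchecked.
    return f"{south} {west} {north} {east}"


# Illustrative extent only (roughly the UK), not taken from a real record.
print(gemini_bbox_to_geoshape_box(west=-8.65, east=1.77, south=49.86, north=60.86))
```

A DCAT serialisation would more likely carry the same two corners as a geometry literal (for example WKT); that variant is not shown here.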
@PeterParslow what do I need to do next? I can't remember...
See if what I've come up with in a desk exercise matches what you'd expect from the GeoNetwork implementation of DCAT?
@archaeogeek do you have a link to where this transformation is mapped in GeoNetwork 4? It is available (in theory) through the OGC API - Records interface, though the links aren't working for us.
@nmtoken it's not the mapping. We have it working here: https://spatialdata.gov.scot/geonetwork/api/collections/main/items/fa510351-8e30-4147-b984-862be84a6f90. You need to check the log files; I suspect you're missing the relevant xsl files in https://github.com/geonetwork/geonetwork-microservices/tree/main/modules/services/ogc-api-records/src/main/resources/xslt/ogcapir/formats/copy (which is completely undocumented). Basically you need a gemini one that matches the iso19139 one.
Not the headers then (https://github.com/geonetwork/geonetwork-microservices/issues/114) ?
@nmtoken the above is all I had to do to get it working, YMMV.
@archaeogeek Just checking we are not talking at cross purposes: you seem to be saying that, in your Tree Preservation Orders - Argyll and Bute example, the schema.org, dcat, dcat_turtle, and geojson tabs link to content because you have a gemini XSL file and we don't.
For us (for example https://metadata.bgs.ac.uk/geonetwork/api/collections/main/items/a2b1143b-5c5d-23d6-e054-002128a47908) and the EEA geospatial data catalogue (for example https://sdi.eea.europa.eu/catalogue/api/collections/main/items/71c47f78-27b6-4080-acd5-47b306b273d8) these tabs don't give any content (only errors).
We have used W3C’s recommendations for mapping from ISO 19115 to Schema.org. This table summarises the Schema.org equivalence statements given for each element below.

Whilst there is no specific DD2 recommendation concerning DCAT, we believe a DCAT2 “equivalent element” for each GEMINI element would be useful, by supporting those whose web publication of GEMINI records uses DCAT as opposed to Schema.org. Where this is easily available from the same W3C source, we have included this below.

You will see that the two vocabularies are very similar, but note that:

- some of the DCAT elements sit in the DCAT “distribution” section, not their “dataset”;
- many DCAT properties have structured content, so this is not a complete list of how to implement it; and
- there are many other DCAT properties that should also be used, beyond those that exist in Schema.org (e.g. conformsTo, creator, spatialResolutionInMeters, format).
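
To illustrate the point that some DCAT properties belong to the distribution and not the dataset, here is a minimal sketch that sorts a few of the unconditional rows from the table into the two places. It assumes a GEMINI record has already been flattened into a plain dictionary keyed by element name; that input shape, the key `"Resource locator (download)"` and the function name are hypothetical. The output is an ordinary nested dictionary using the prefixed property names from the table, not a complete or validated DCAT serialisation.

```python
# Dataset-level properties, taken from the DCAT/DCAT2 column of the table.
DATASET_PROPS = {
    "Title": "dct:title",
    "Abstract": "dct:description",
    "Resource identifier": "dct:identifier",
}

# Properties the table places on a dcat:Distribution.
# The key "Resource locator (download)" is a made-up label for a
# Resource locator whose 19115 function is "download".
DISTRIBUTION_PROPS = {
    "Data format": "dct:format",
    "Resource locator (download)": "dcat:downloadURL",
}


def gemini_to_dcat_sketch(record):
    """Split a flat GEMINI-like dict into dataset and distribution parts."""
    dataset = {"@type": "dcat:Dataset"}
    distribution = {"@type": "dcat:Distribution"}
    for element, value in record.items():
        if element in DATASET_PROPS:
            dataset[DATASET_PROPS[element]] = value
        elif element in DISTRIBUTION_PROPS:
            distribution[DISTRIBUTION_PROPS[element]] = value
        # Elements outside this small subset are ignored in the sketch.
    if len(distribution) > 1:  # something beyond "@type" was added
        dataset["dcat:distribution"] = [distribution]
    return dataset


example = {
    "Title": "Example dataset",
    "Data format": "GML",
    "Resource locator (download)": "https://example.org/data.gml",
}
result = gemini_to_dcat_sketch(example)
```

This sketch always produces at most one distribution; deciding how many distributions a real record needs is a separate question that it does not attempt to answer.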