Esri / geoportal-server

Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.
https://gptogc.esri.com/geoportal
Apache License 2.0

Project Open Data DCAT Distribution Fields dropped on harvest #257

Open torrin47 opened 7 years ago

torrin47 commented 7 years ago

The Project Open Data schema defines a whole list of elements that constitute a single Distribution section, including title, description, downloadURL/accessURL, format, mediaType, describedBy, describedByType, and conformsTo. When Geoportal Server harvests a dcat.json file, however, only downloadURL/accessURL and mediaType appear to be recognized; the other fields are dropped. Then, when producing the DCAT output, the mediaType stored by Geoportal Server isn't honored, and a default value is produced instead.

Here's an example file we're harvesting: https://pasteur.epa.gov/metadata.json

An example of what one record with many distribution elements looks like when stored in Geoportal Server XML: https://edg.epa.gov/metadata/rest/document?id=%7B6C145F33-CE69-44FA-95E8-9CA320930EB7%7D

And what the DCAT output from Geoportal Server is: https://edg.epa.gov/metadata/rest/find/document?f=dcat&searchText=fileIdentifier%3AA-p2p0-450
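To make the symptom concrete, here is a minimal Python sketch that compares one distribution as harvested with the same distribution as it comes back out. The field list is the one named above; the sample values, the default media type, and the helper names `dropped_fields` and `changed_fields` are made up for illustration and are not taken from the EPA records or from Geoportal Server.

```python
# Distribution fields named in the report above (Project Open Data schema).
POD_DISTRIBUTION_FIELDS = (
    "title", "description", "downloadURL", "accessURL", "format",
    "mediaType", "describedBy", "describedByType", "conformsTo",
)


def dropped_fields(source, output):
    """Fields present in the harvested distribution but absent from the output."""
    return [f for f in POD_DISTRIBUTION_FIELDS if f in source and f not in output]


def changed_fields(source, output):
    """Fields present on both sides whose values differ."""
    return [
        f for f in POD_DISTRIBUTION_FIELDS
        if f in source and f in output and source[f] != output[f]
    ]


# Hypothetical sample data, not copied from any real catalog.
source_distribution = {
    "title": "Sample table",
    "description": "Illustrative CSV download",
    "downloadURL": "https://example.org/data.csv",
    "format": "CSV",
    "mediaType": "text/csv",
}
output_distribution = {
    "downloadURL": "https://example.org/data.csv",
    # Assumed placeholder for "a default value"; the real default may differ.
    "mediaType": "application/octet-stream",
}

print("dropped:", dropped_fields(source_distribution, output_distribution))
print("changed:", changed_fields(source_distribution, output_distribution))
```

Run against a real pair of records, the first list would show which fields are lost on harvest and the second would flag the mediaType substitution.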

We'd also love for the Details page to be able to display all of those Distribution Fields: https://edg.epa.gov/metadata/catalog/search/resource/details.page?uuid=%7B6C145F33-CE69-44FA-95E8-9CA320930EB7%7D but that might be regarded as a separate issue.

mhogeweg commented 7 years ago

When we harvest DCAT into Geoportal 1.2.x, we generate a Dublin Core record for it (since many DCAT files we've seen have limited information). This will require updating our Dublin Core metadata definition to include these elements. Then we can update the mapping from the new Dublin Core definition to DCAT.
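As a rough sketch of what such a two-way mapping could look like, the Python below maps each distribution field to an element name and back. This is not Geoportal Server's actual metadata definition: the element names (in particular the `ext:` ones) are placeholders chosen for illustration, and `to_elements` and `from_elements` are hypothetical helpers.

```python
# Illustrative mapping only: the element names are assumptions, not the
# names used by Geoportal Server's Dublin Core definition.
FIELD_TO_ELEMENT = {
    "title": "dc:title",
    "description": "dc:description",
    "format": "dc:format",
    "mediaType": "dcat:mediaType",
    "downloadURL": "dcat:downloadURL",
    "accessURL": "dcat:accessURL",
    "conformsTo": "dct:conformsTo",
    "describedBy": "ext:describedBy",          # placeholder element name
    "describedByType": "ext:describedByType",  # placeholder element name
}
ELEMENT_TO_FIELD = {element: field for field, element in FIELD_TO_ELEMENT.items()}


def to_elements(distribution):
    """Harvest direction: distribution dict -> list of (element, value) pairs."""
    return [
        (FIELD_TO_ELEMENT[field], value)
        for field, value in distribution.items()
        if field in FIELD_TO_ELEMENT
    ]


def from_elements(elements):
    """Output direction: (element, value) pairs -> distribution dict."""
    return {
        ELEMENT_TO_FIELD[element]: value
        for element, value in elements
        if element in ELEMENT_TO_FIELD
    }


# Hypothetical sample distribution.
distribution = {
    "title": "Sample table",
    "format": "CSV",
    "mediaType": "text/csv",
    "downloadURL": "https://example.org/data.csv",
    "conformsTo": "https://example.org/spec",
}

# With every field mapped in both directions, the round trip is lossless.
assert from_elements(to_elements(distribution)) == distribution
```

The point of keeping one table and deriving the reverse lookup from it is that a field cannot be stored on harvest and then silently fall back to a default on output. A record with several distributions would additionally need its elements grouped per distribution, which this sketch does not address.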