thatbudakguy closed this 5 months ago
Making this a draft again pending discussion of behavior for some fields; see https://github.com/OpenGeoMetadata/metadata-issues/issues/50
The path in lib/geo_combine/geoblacklight.rb:16 changed to "https://raw.githubusercontent.com/OpenGeoMetadata/opengeometadata.github.io/main/docs/schema/geoblacklight-schema-#{GEOBLACKLIGHT_VERSION}.json"
@thatbudakguy any chance we can get solr_geom to dcat_bbox added to this? This is otherwise working.
@the-codetrane thx for pointing that out; I added a step to handle dcat_bbox. This PR is now blocked by #162.
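A step of this kind could be sketched roughly as follows. This is an illustration only, not the code in the PR: the method name `copy_bbox` is hypothetical, and the sketch assumes the record is a plain Hash with string keys and that the old solr_geom value can be carried over unchanged.

```ruby
# Hypothetical helper (not taken from the PR): move the bounding box
# stored in solr_geom to the dcat_bbox field, leaving the value as-is.
def copy_bbox(record)
  migrated = record.dup
  return migrated unless migrated.key?('solr_geom')

  # Hash#delete returns the removed value, so the old key is stripped
  # and its value reused in a single step.
  migrated['dcat_bbox'] = migrated.delete('solr_geom')
  migrated
end

record = {
  'id' => 'nyu-2451-34626',
  'solr_geom' => 'ENVELOPE(73.557693, 134.773911, 53.56086, 10.175472)'
}
puts copy_bbox(record)
```

Working on a copy keeps the input record untouched, which makes it easier to compare the record before and after migration.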
@thatbudakguy found another key that could be migrated: layer_geom_type_s to gbl_resourceType_sm. The crosswalk documentation has them as deprecated/new fields, but it would appear they are in fact related.
There's code in this PR to do that: we use a lookup table to map geometry types to resource types. It's only straightforward for a few cases, imo. Does it not work for you?
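For readers following along, a lookup-table approach like the one described might look like the sketch below. It is not the PR's actual code: the constant and method names are hypothetical, and only the Polygon entry is confirmed by the output posted in this thread. The other entries are assumptions about the vocabulary.

```ruby
# Illustrative lookup table. Only the 'Polygon' entry is confirmed by
# output in this thread; the remaining entries are assumptions.
RESOURCE_TYPE_LOOKUP = {
  'Point' => 'Point data',
  'Line' => 'Line data',
  'Polygon' => 'Polygon data',
  'Raster' => 'Raster data'
}.freeze

# Hypothetical helper: replace layer_geom_type_s with gbl_resourceType_sm
# when the geometry type has an entry in the table.
def convert_geom_type(record)
  migrated = record.dup
  geom_type = migrated.delete('layer_geom_type_s')
  resource_type = RESOURCE_TYPE_LOOKUP[geom_type]
  # The target field is multi-valued, so the value is wrapped in an array.
  migrated['gbl_resourceType_sm'] = [resource_type] if resource_type
  migrated
end

puts convert_geom_type('layer_geom_type_s' => 'Polygon')
```

In this sketch a geometry type with no table entry is dropped without a replacement, which reflects the point that the mapping is only straightforward for a few cases. A real migrator may prefer to keep or report such values.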
This is what comes out when I run the migrator on a GBL 1.0 schema record:
{
"dct_description_sm": [
"This polygon shapefile represents the 1964 County Boundaries for China. The layer includes population census data and was primarily based on the \"Historical Administrative Maps of the People's Republic of China,\" published by China Map Press, and some other yearly administrative maps. See the documentation for more information and a list of the layer variables."
],
"dct_format_s": "Shapefile",
"dct_identifier_sm": [
"http://hdl.handle.net/2451/34626"
],
"dct_language_sm": [
"English"
],
"dct_publisher_sm": [
"Beijing Hua tong ren shi chang xin xi you xian ze ren gong si"
],
"dc_relation_sm": [
"http://sws.geonames.org/1814991/about/rdf"
],
"dct_accessRights_s": "Restricted",
"dct_subject_sm": [
"Boundaries",
"Demographic surveys",
"Population"
],
"dct_title_s": "1964 County Boundaries of China with Population Census Data",
"dc_type_s": "Dataset",
"dct_isPartOf_sm": [
"Historical China County Population Census Data"
],
"dct_issued_s": "2005",
"schema_provider_s": "NYU",
"dct_references_s": "{\"http://schema.org/url\":\"http://hdl.handle.net/2451/34626\",\"http://www.opengis.net/def/serviceType/ogc/wfs\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wfs\",\"http://www.opengis.net/def/serviceType/ogc/wms\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wms\",\"http://schema.org/downloadUrl\":\"https://archive.nyu.edu/retrieve/74851/nyu_2451_34626.zip\",\"http://lccn.loc.gov/sh85035852\":\"https://archive.nyu.edu/retrieve/74896/nyu_2451_34626_doc.zip\"}",
"dct_spatial_sm": [
"People's Republic of China, China"
],
"dct_temporal_sm": [
"1964"
],
"gbl_mdVersion_s": "Aardvark",
"layer_geom_type_s": "Polygon", // I'M GUESSING THIS IS SUPPOSED TO BE SOMETHING ELSE?
"gbl_wxsIdentifier_s": "sdr:nyu_2451_34626",
"gbl_mdModified_dt": "2016-11-10T15:51:38Z",
"id": "nyu-2451-34626",
"nyu_addl_dspace_s": "35559",
"locn_geometry": "ENVELOPE(73.557693, 134.773911, 53.56086, 10.175472)",
"gbl_indexYear_im": [
1964
],
"nyu_addl_format_sm": [
"Shapefile"
],
"_version_": 1779481613907787776,
"timestamp": "2023-10-11T17:38:31.500Z"
}
"layer_geom_type_s": "Polygon", // I'M GUESSING THIS IS SUPPOSED TO BE SOMETHING ELSE?
I would expect "gbl_resourceType_sm": "Polygon Data" according to the controlled vocab.
@the-codetrane can you share the record that you transformed to get that output?
@thatbudakguy My contract at NYU ended, so I'm outside the walled garden. @mnyrop should be able to help you with this.
OK, I found the record. I ran it through the migrator myself and got:
{
"dct_creator_sm": [],
"dct_description_sm": [
"This polygon shapefile represents the 1964 County Boundaries for China. The layer includes population census data and was primarily based on the \"Historical Administrative Maps of the People's Republic of China,\" published by China Map Press, and some other yearly administrative maps. See the documentation for more information and a list of the layer variables."
],
"dct_format_s": "Shapefile",
"dct_identifier_sm": ["http://hdl.handle.net/2451/34626"],
"dct_language_sm": ["English"],
"dct_publisher_sm": [
"Beijing Hua tong ren shi chang xin xi you xian ze ren gong si"
],
"dc_relation_sm": ["http://sws.geonames.org/1814991/about/rdf"],
"dct_accessRights_s": "Restricted",
"dct_subject_sm": ["Boundaries", "Demographic surveys", "Population"],
"dct_title_s": "1964 County Boundaries of China with Population Census Data",
"dct_issued_s": "2005",
"schema_provider_s": "NYU",
"dct_references_s": "{\"http://schema.org/url\":\"http://hdl.handle.net/2451/34626\",\"http://www.opengis.net/def/serviceType/ogc/wfs\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wfs\",\"http://www.opengis.net/def/serviceType/ogc/wms\":\"https://maps-restricted.geo.nyu.edu/geoserver/sdr/wms\",\"http://schema.org/downloadUrl\":\"https://archive.nyu.edu/retrieve/74851/nyu_2451_34626.zip\",\"http://lccn.loc.gov/sh85035852\":\"https://archive.nyu.edu/retrieve/74896/nyu_2451_34626_doc.zip\"}",
"dct_spatial_sm": ["People's Republic of China, China"],
"dct_temporal_sm": ["1964"],
"gbl_mdVersion_s": "Aardvark",
"gbl_wxsIdentifier_s": "sdr:nyu_2451_34626",
"gbl_mdModified_dt": "2016-11-10T15:51:38Z",
"id": "nyu-2451-34626",
"nyu_addl_dspace_s": "35559",
"dcat_bbox": "ENVELOPE(73.557693, 134.773911, 53.56086, 10.175472)",
"gbl_indexYear_im": [1964],
"gbl_resourceClass_s": ["Datasets"],
"gbl_resourceType_s": ["Polygon data"]
}
It turned out there was just a typo; the new field is gbl_resourceType_sm (not gbl_resourceType_s), as it's multi-valued. Otherwise, the conversion works as expected (it outputs Polygon data and the original field is stripped).
I've corrected the mistake.
Resource Class is also multivalued: gbl_resourceClass_sm
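A small check along these lines could flag this kind of suffix mismatch. It is only a sketch: the method name `mismatched_suffixes` is hypothetical, and it assumes the convention that fields ending in `_s` hold a single value while `_sm` fields hold arrays.

```ruby
# Hypothetical check (not part of the PR): list fields whose name ends
# in the single-valued suffix "_s" but whose value is an array.
def mismatched_suffixes(record)
  record.select { |key, value| value.is_a?(Array) && key.end_with?('_s') }.keys
end

record = {
  'dct_format_s' => 'Shapefile',
  'dct_language_sm' => ['English'],
  'gbl_resourceClass_s' => ['Datasets'],
  'gbl_resourceType_s' => ['Polygon data']
}
p mismatched_suffixes(record)
# prints ["gbl_resourceClass_s", "gbl_resourceType_s"]
```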
Closes #121