Closed by MattBlissett 4 years ago
Please bear in mind that we expect GADM to be used heavily (e.g. choropleth maps built by aggregating on a code, or country-page metrics aggregated by county), which may influence the ES structure we need.
GADM will give us this sort of data:
type │ id │ source │ title │ isocountrycode2digit
───────┼─────────────┼──────────────────┼────────┼──────────────────────
GADM0 │ JPN │ http://gadm.org/ │ Japan │ JP
GADM1 │ JPN.26_1 │ http://gadm.org/ │ Nagano │ JP
GADM2 │ JPN.26.40_1 │ http://gadm.org/ │ Nagawa │ JP
The ids are clearly structured. @fmendezh, is there something ES can do with this, or should it be four fields (some countries will have GADM3)?
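To illustrate what "clearly structured" could mean in practice, here is a minimal sketch that derives the level and the parent ids from a single GADM id. It assumes the pattern visible in the table above (dot-separated path, optional `_version` suffix, level 0 without a suffix) holds for every id, which has not been checked against the full GADM dataset. The function names `gadm_level` and `gadm_ancestors` are hypothetical.

```python
from typing import List


def gadm_level(gid: str) -> int:
    """Return the GADM level implied by the number of dots in the id path."""
    path = gid.partition("_")[0]  # drop the "_1" style version suffix, if any
    return path.count(".")


def gadm_ancestors(gid: str) -> List[str]:
    """Return the ids from level 0 down to the given id (inclusive).

    Assumption: a parent id is the child's path minus its last component,
    with the same version suffix re-attached (level 0 has no suffix).
    """
    path, _, version = gid.partition("_")
    parts = path.split(".")
    ids = [parts[0]]  # level 0 is the bare three-letter code
    for depth in range(2, len(parts) + 1):
        prefix = ".".join(parts[:depth])
        ids.append(f"{prefix}_{version}" if version else prefix)
    return ids


if __name__ == "__main__":
    print(gadm_level("JPN.26.40_1"))
    print(gadm_ancestors("JPN.26.40_1"))
```

If that assumption holds, a single indexed id would be enough to recover the higher levels, although separate fields per level may still be simpler to aggregate on.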
Do we want to index the title, or just the id?
I think we need code and title since we'll want to include this in the occurrence JSON response.
I could imagine all these being useful:
"GADM": {
  "isoCountryCode": "JP", // to help spot possible data errors
  "level0": {
    "code": "JPN",
    "title": "Japan"
  },
  "level1": {
    "code": "JPN.26_1",
    "title": "Nagano"
  },
  "level2": {
    "code": "JPN.26.40_1",
    "title": "Nagawa"
  }
}
We will definitely want to be able to search and aggregate counts by code, and I don't know if flattening data for ES will help with that (e.g. holding gadm0Code and gadm0Title).
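As a rough sketch of the flattened option, the snippet below turns the nested object into `gadm0Code` / `gadm0Title` style fields and builds a standard Elasticsearch terms aggregation body to count by code. The field naming for levels 1 to 3, the aggregation name `by_code`, and the helper names are assumptions for illustration; the aggregation also assumes the code fields are mapped as `keyword`.

```python
from typing import Any, Dict


def flatten_gadm(gadm: Dict[str, Any]) -> Dict[str, str]:
    """Flatten {"level0": {"code", "title"}, ...} into gadm0Code, gadm0Title, ..."""
    flat: Dict[str, str] = {}
    for level in range(4):  # GADM levels 0 to 3
        entry = gadm.get(f"level{level}")
        if entry is None:
            continue  # not every country has every level
        flat[f"gadm{level}Code"] = entry["code"]
        flat[f"gadm{level}Title"] = entry["title"]
    return flat


def count_by_code_request(level: int, size: int = 100) -> Dict[str, Any]:
    """Build a search body that returns only occurrence counts per GADM code."""
    return {
        "size": 0,  # no hits, just the aggregation buckets
        "aggs": {
            "by_code": {"terms": {"field": f"gadm{level}Code", "size": size}}
        },
    }


if __name__ == "__main__":
    nested = {
        "level0": {"code": "JPN", "title": "Japan"},
        "level1": {"code": "JPN.26_1", "title": "Nagano"},
        "level2": {"code": "JPN.26.40_1", "title": "Nagawa"},
    }
    print(flatten_gadm(nested))
    print(count_by_code_request(1))
```

The nested form could still be rebuilt from the flat fields when producing the occurrence JSON response, so the two shapes need not be the same.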
How important is the isoCountryCode?
GADM uses three-letter codes (which we could map to two letters using Country), but it also includes seven additional custom codes:
XAD │ Akrotiri and Dhekelia
XCA │ Caspian Sea
XCL │ Clipperton Island
XKO │ Kosovo (NB we already have XKX from other sources)
XNC │ Northern Cyprus
XPI │ Paracel Islands
XSP │ Spratly Islands
So these won't match up anyway (XCL would be FR from our Natural Earth interpretation). I could invent two-letter codes, or leave three-letter codes (but then there's little advantage over the level0 code).
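One way to handle the mismatch, sketched below, is to map standard three-letter codes to two letters and return nothing for the seven custom codes, rather than inventing codes. The small `ISO3_TO_ISO2` table is an illustrative stand-in for whatever the Country lookup provides and is not its real API; `iso2_for_gadm0` is a hypothetical name.

```python
from typing import Optional

# The seven custom GADM level 0 codes listed above.
GADM_CUSTOM_CODES = {"XAD", "XCA", "XCL", "XKO", "XNC", "XPI", "XSP"}

# Illustrative subset only; a real mapping would cover every ISO 3166-1 code.
ISO3_TO_ISO2 = {"JPN": "JP", "FRA": "FR", "DNK": "DK", "NZL": "NZ"}


def iso2_for_gadm0(code: str) -> Optional[str]:
    """Return the two-letter country code for a GADM level 0 code, if any."""
    if code in GADM_CUSTOM_CODES:
        return None  # no ISO equivalent; leave isoCountryCode unset
    return ISO3_TO_ISO2.get(code)


if __name__ == "__main__":
    for gadm0 in ("JPN", "XCL", "XKO"):
        print(gadm0, iso2_for_gadm0(gadm0))
```

Leaving the field empty for custom codes keeps it usable for spotting data errors, at the cost of records in those areas having no isoCountryCode at all.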
We have mixed-up GADM results on records: http://api.gbif-uat.org/v1/occurrence/1249992702 (contains Midtjylland and Wellington).
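A mix-up like this could in principle be caught with a prefix check across levels. The sketch below assumes each level's id path extends its parent's path, as in the Japan example earlier in the thread; the mismatched codes in the usage example are made-up placeholders, not the actual values on that record, and `gadm_levels_consistent` is a hypothetical helper.

```python
from typing import List


def gadm_levels_consistent(codes: List[str]) -> bool:
    """Check that each code (ordered from level 0 down) extends its parent.

    Version suffixes such as "_1" are ignored for the comparison.
    """
    for parent, child in zip(codes, codes[1:]):
        parent_path = parent.partition("_")[0]
        child_path = child.partition("_")[0]
        if not child_path.startswith(parent_path + "."):
            return False
    return True


if __name__ == "__main__":
    # Consistent: all three levels belong to the same hierarchy.
    print(gadm_levels_consistent(["JPN", "JPN.26_1", "JPN.26.40_1"]))
    # Inconsistent: placeholder codes from two different countries.
    print(gadm_levels_consistent(["DNK", "NZL.1_1"]))
```

A check along these lines could run as a test before deployment, without depending on the titles matching.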
This blocks deployment to production.
Deployed to PROD
To enable search and analysis by administrative region, add GADM at levels 0, 1, 2 and 3 to occurrences.
This should be an additional field, and not change or use dwc:stateProvince etc.
Depends on https://github.com/gbif/geocode/issues/6