wmgeolab / geoBoundaries

geoBoundaries : A Political Administrative Boundaries Dataset (www.geoboundaries.org)
http://www.geoboundaries.org

[BOUNDARY ERRATA] missing shapeName field for BHS ADM0 #2288

Closed slaperche-zenly closed 1 year ago

slaperche-zenly commented 2 years ago

Describe the Error

There is no shapeName field for BHS at level 0 (level 1 is fine); instead, the field is named SOVEREIGNT.

Screenshots

Info for geoBoundaries-BHS-ADM0.dbf
6 Columns,  1 Records in file
     SOVEREIGNT          string  (27,0)
          Level          string  (4,0)
       shapeISO          string  (3,0)
        shapeID          string  (26,0)
     shapeGroup          string  (3,0)
      shapeType          string  (4,0)

slaperche-zenly commented 2 years ago

I've found this kind of issue in other files, but they have already been reported:

DanRunfola commented 2 years ago

This is also messing with our CGAZ builds, so it's on the docket for correction before 5.0. We have some very simplistic code that tries to catch these, but it's not nearly flexible enough today.

One broad question that keeps coming up is whether we should retain fields that don't match our schema. Today, I think the answer is yes, for precisely this reason (i.e., you can find these errata), but once we're a bit more stable on the name front I think it may be important to strip out excess columns.
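
For illustration only, a check of this kind could be as small as the sketch below. It is not the project's actual build code: `check_fields` is a hypothetical helper, and the expected field list is assumed from the AUS ADM0 layer shown further down this thread, which may not be the complete schema.

```shell
#!/usr/bin/env bash
# Expected attribute fields, taken from the AUS ADM0 layer in this thread.
# Treating this list as the complete schema is an assumption.
EXPECTED_FIELDS="shapeName shapeISO shapeID shapeGroup shapeType"

# check_fields FIELD...: report schema fields that are missing from the
# given list, and given fields that are not part of the schema.
# (Hypothetical helper, not part of the geoBoundaries build.)
check_fields() {
  local missing="" extra="" f
  for f in $EXPECTED_FIELDS; do
    case " $* " in
      *" $f "*) ;;                      # schema field is present
      *) missing="$missing $f" ;;       # schema field is absent
    esac
  done
  for f in "$@"; do
    case " $EXPECTED_FIELDS " in
      *" $f "*) ;;                      # field belongs to the schema
      *) extra="$extra $f" ;;           # field is outside the schema
    esac
  done
  echo "missing:${missing:- none}"
  echo "extra:${extra:- none}"
}

# Fields from the BHS ADM0 .dbf listing reported above.
check_fields SOVEREIGNT Level shapeISO shapeID shapeGroup shapeType
```

Run against the BHS field list, this reports shapeName as missing and SOVEREIGNT and Level as extra. Keeping the extra columns in the file is what makes the second line of the report possible.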

mdsumner commented 1 year ago

I just found this while doing batch reads of all the codes from gbOpen into R.

The underlying GDAL command is below, but I was just reading the fields and trying to bind them. I can scan all the layers just by info and summarize where there are inconsistencies, if that helps?

export ISO=BHS
ogrinfo -ro -so /vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/gbOpen/${ISO}/ADM0/geoBoundaries-${ISO}-ADM0-all.zip geoBoundaries-${ISO}-ADM0

INFO: Open of `/vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/gbOpen/BHS/ADM0/geoBoundaries-BHS-ADM0-all.zip'
      using driver `ESRI Shapefile' successful.

Layer name: geoBoundaries-BHS-ADM0
Metadata:
  DBF_DATE_LAST_UPDATE=2023-01-23
Geometry: Polygon
Feature Count: 1
Extent: (-79.594350, 20.912399) - (-72.746165, 26.928412)
Layer SRS WKT:
GEOGCRS["WGS 84",
... <snip>

SOVEREIGNT: String (27.0)
Level: String (4.0)
ISO_Code: String (3.0)

export ISO=AUS
ogrinfo -ro -so /vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/gbOpen/${ISO}/ADM0/geoBoundaries-${ISO}-ADM0-all.zip geoBoundaries-${ISO}-ADM0

INFO: Open of `/vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/gbOpen/AUS/ADM0/geoBoundaries-AUS-ADM0-all.zip'
      using driver `ESRI Shapefile' successful.

Layer name: geoBoundaries-AUS-ADM0
Metadata:
  DBF_DATE_LAST_UPDATE=2023-01-23
Geometry: Polygon
Feature Count: 1
Extent: (96.816952, -43.740497) - (167.998039, -9.142163)
Layer SRS WKT:
GEOGCRS["WGS 84",
...<snip>

shapeName: String (9.0)
shapeISO: String (3.0)
shapeID: String (23.0)
shapeGroup: String (3.0)
shapeType: String (4.0)
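
The scan offered above could look roughly like the following sketch, which pulls field names out of the `ogrinfo -so` summary and flags layers without shapeName. `list_fields` and `scan_iso` are hypothetical names; the parsing assumes the "name: Type (width.precision)" lines seen in the two outputs above, which other GDAL versions may format differently.

```shell
#!/usr/bin/env bash
# list_fields: read `ogrinfo -so` text on stdin, print one field name per line.
# Matches only lines shaped like "shapeName: String (9.0)", so header lines
# such as "Geometry: Polygon" or "Extent: (...)" are skipped.
list_fields() {
  sed -n -E 's/^([A-Za-z_][A-Za-z0-9_]*): [A-Za-z0-9]+ \([0-9]+\.[0-9]+\)$/\1/p'
}

# scan_iso ISO: fetch the ADM0 summary for one ISO code and flag a missing
# shapeName. Needs GDAL's ogrinfo and network access; the URL pattern is
# the one used in the commands above.
scan_iso() {
  local iso="$1" fields
  fields="$(ogrinfo -ro -so \
    "/vsizip//vsicurl/https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/gbOpen/${iso}/ADM0/geoBoundaries-${iso}-ADM0-all.zip" \
    "geoBoundaries-${iso}-ADM0" | list_fields)"
  if printf '%s\n' "$fields" | grep -qx 'shapeName'; then
    echo "${iso}: ok"
  else
    # Unquoted on purpose: flattens the newline-separated names onto one line.
    echo "${iso}: no shapeName (fields: $(echo $fields))"
  fi
}

# Example (not run here because it downloads data):
#   for iso in BHS AUS; do scan_iso "$iso"; done
```

Splitting the parsing from the download keeps `list_fields` usable on saved ogrinfo output, so the field check can be tried without network access.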
DanRunfola commented 1 year ago

Just to finally follow up on this, we have manually gone in and updated layers as appropriate in this case. See PRs:
https://github.com/wmgeolab/geoBoundaries/pull/2818 <- NPL ADM2
https://github.com/wmgeolab/geoBoundaries/pull/2815 <- SLE ADM2
https://github.com/wmgeolab/geoBoundaries/pull/2813 <- IND ADM0

AFG and AGO were both resolved by adding new datasets into the database since this issue was created.

I hope to have these PRs accepted today, and then integrated into the database on the next build.