wmgeolab / geoBoundaries

geoBoundaries : A Political Administrative Boundaries Dataset (www.geoboundaries.org)
http://www.geoboundaries.org

Contested Border Updates for 4.0 #12

Open DanRunfola opened 4 years ago

DanRunfola commented 4 years ago

Alright - as we move towards a more structured set of data products, the way we engage with contested borders is going to mean we need to update some of our data products. Just to put this in one spot:

1) Our core high-precision single-country release will always reflect (to the best of our ability) each country's understanding of its own boundaries. I.e., we want our Israel boundary to reflect what Israel thinks Israel is in this product.

2) In our single-country global standardized products, we'll be clipping to internationally recognized boundaries (starting with US Department of State, and adding the UN later on). In these products, boundaries will be reflective of the intersection between "what the UN says Israel's boundaries are, and what Israel says they are".

3) In our globally contiguous, simplified, standardized product, we'll first grow borders out for each country, then clip to an internationally recognized standard (e.g., the UN). So, borders will be reflective of "What Israel says Israel is", + any land that the UN says they have, - any land the UN says they don't have (or US Department of State, depending on the baseline used).
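The difference between products (2) and (3) can be sketched with plain set algebra. This is a toy illustration only, not the geoBoundaries pipeline: territory is modeled as a set of hypothetical grid-cell IDs instead of real polygons, and the function names are made up for this example.

```python
def clip_to_standard(national_claim, standard):
    """Product (2): keep only land that both the country and the standard agree on."""
    return national_claim & standard


def grow_then_clip(national_claim, standard_has, standard_excludes):
    """Product (3): national claim, plus land the standard assigns, minus land it excludes."""
    return (national_claim | standard_has) - standard_excludes


# Hypothetical grid cells; the letters carry no geographic meaning.
national_claim = {"a", "b", "c"}
standard = {"b", "c", "d"}

single_country = clip_to_standard(national_claim, standard)
contiguous = grow_then_clip(national_claim, standard_has=standard, standard_excludes={"a"})
```

In this toy case the single-country result is the overlap of the two views, while the contiguous result ends up matching the standard exactly, because every cell is either added or removed according to the baseline.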

With those notes, we will be working towards the following for 4.0:

-> Israel needs to be updated to reflect Israel + Palestine, rather than only Israel; this will not impact the Palestinian border in our stand-alone country datasets.

-> Other issues will be noted as they emerge here.

DanRunfola commented 4 years ago

Some further updates on this:

1) Right now, contested areas that don't have an official policy designation by the standardizing country (e.g., Western Sahara in the US Department of State LSIB) are being assigned based on population - i.e., we simply merge the contested area into whichever claimant country has the largest population. This is, obviously, arbitrary.

2) Simply keeping disputed areas as "disputed" isn't an option in the CGAZ case. Take Western Sahara as the example - if we flag it as disputed, that means that Morocco's ADM0/1/2 would all get deleted starting at the dispute line (which, of course, Morocco would take issue with!).
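A minimal sketch of the population-based rule from (1), assuming "population" means each claimant country's own population (the comment does not spell this out). The function name, the alphabetical tie-break, and the figures are all hypothetical.

```python
def assign_by_population(claimant_populations):
    """Return the claimant code with the largest population.

    Ties are broken alphabetically so the result is deterministic.
    """
    if not claimant_populations:
        raise ValueError("at least one claimant is required")
    return min(claimant_populations, key=lambda iso: (-claimant_populations[iso], iso))


# Placeholder claimants and populations, not real figures.
populations = {"AAA": 1_000_000, "BBB": 2_500_000}
winner = assign_by_population(populations)
```

The contested area would then be dissolved into the polygon of `winner`, which keeps that country's ADM0/1/2 hierarchy intact across the dispute line.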

I am considering a few different solutions here - again, this would ONLY impact CGAZ:

1) In the near-term (Versions 4, and probably 5), retain the current 'solution' of merging areas into the region with the largest population. This would become our default product.

2) In the medium-term (Version ~6), provide a simple interface that allows users to make a choice when it comes to contested boundaries, and generate a CGAZ release on-the-fly that reflects what they need. Defaults for this tool would still be population-based. When a user goes to download CGAZ through the interface, these options would only appear behind an advanced toggle that is hidden by default.

I think (2) above would solve most issues, and pre-generating all possible permutations is (I think) feasible to do, though computationally expensive.

DanRunfola commented 3 years ago

Alright, there has been even more discussion on this topic off of GitHub that I want to port here.

1) It was suggested that - in addition to whatever assumptions we make - we produce a shapefile representing all areas we consider contested. This would allow users to understand where assumptions are being made, and even change those assumptions if desired.

2) Another interesting thought was a dynamic system with prebuilt defaults that advanced users could override. In this system, each polygon that is contested would have a comma-delimited list of neighboring (or not) polygons it could be merged into. We would select defaults based on some arbitrary scheme, but then expose any permutation of those settings to end-users. Pre-processing all possible permutations would be computationally expensive, but not prohibitively so.
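To get a feel for the cost of pre-processing every permutation, here is a rough sketch. The area codes and candidate lists are hypothetical; the only thing taken from the idea above is that each contested polygon carries a comma-delimited list of merge targets.

```python
from itertools import product
from math import prod

# Hypothetical contested polygons, each with its comma-delimited merge candidates.
merge_candidates = {
    "X01": "AAA,BBB",
    "X02": "BBB,CCC,DDD",
}

options = {area: targets.split(",") for area, targets in merge_candidates.items()}

# One dictionary per permutation: contested area -> country it is merged into.
areas = list(options)
permutations = [dict(zip(areas, choice)) for choice in product(*(options[a] for a in areas))]

# The count is the product of the per-area option counts, so it grows multiplicatively.
expected_count = prod(len(targets) for targets in options.values())
```

Two areas with two and three candidates give six releases to build; each additional contested polygon multiplies that number by its own candidate count, which is where the computational expense comes from.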

DanRunfola commented 3 years ago

Just to further update on this, to keep the conversation in one spot, we have finally arrived at something representing a solution to this. In our current iteration of our master ISO tracking table, we now have two new columns: claimants and disputed.

https://github.com/wmgeolab/geoBoundaryBot/blob/main/dta/iso_3166_1_alpha_3.csv

Each row in this database would cover a unique geographic area; in the case of a conflict without an ISO code, we'll generate a temporary one if necessary. The default "global" view will have every single row represented, along with the claimant and disputed columns in the metadata. We will then build a tool that allows individuals to aggregate regions as they desire, though this will come later.

Claimants will be pipe-delimited.
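As a rough illustration of reading such a table, the sketch below parses a pipe-delimited claimants field. Only the claimants and disputed column names come from the description above; the iso3 column name, the TRUE/FALSE encoding, and the sample rows are assumptions and may not match the actual CSV.

```python
import csv
import io

# Made-up rows; the real file's columns and value encodings may differ.
sample = io.StringIO(
    "iso3,claimants,disputed\n"
    "AAA,,FALSE\n"
    "X01,AAA|BBB,TRUE\n"
)


def parse_claimants(value):
    """Split a pipe-delimited claimant field, dropping empty entries."""
    return [code.strip() for code in value.split("|") if code.strip()]


rows = [
    {
        "iso3": row["iso3"],
        "claimants": parse_claimants(row["claimants"]),
        "disputed": row["disputed"].strip().upper() == "TRUE",
    }
    for row in csv.DictReader(sample)
]
```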

Keep in mind, the country products (e.g., Serbia) will still represent "what Serbia thinks of Serbia", so would include Kosovo. Only in the global cases will we be disambiguating using this approach.

maxmalynowsky commented 3 years ago

I've recently been working with the US Department of State LSIB dataset. While looking for a polygon version of this data, I noticed that the most recent version was created in 2017 and is hosted here: https://data.humdata.org/dataset/global-lsib-polygons-detailed. Since then, refinements and changes have been made to the original line layer as recently as May 2021, but no updated polygons have been released since the 2017 version.

I felt like the CGAZ product could benefit from using the latest line data available, so I created a data management tool to help facilitate an automated conversion to polygons, using a superset of the latest LSIB lines: https://github.com/fieldmaps/adm0-template.

Most attribute values for this layer are derived from the UN M49 standard, with some additional columns added. One of these columns (iso_grp) attempts to document de facto administration of external territories and disputed areas. I haven't included claimant or disputed metadata, although disputed areas are somewhat identified with an iso3 code starting with X. I'd be open to further developing this tool to accommodate the solutions you described above.

As an additional suggestion for ways to implement different outputs, Mapbox produces boundary layers using 4 separate world views (US / de facto, China, India, and Japan), with a preview of this here: https://demos.mapbox.com/vt_polygons/. Taking this worldview output approach may make implementing some of the disputed area merging a bit more technically feasible, having separate merging attribute values for each worldview. For example, in December 2020 the US government recognized Morocco's sovereignty over the territory of Western Sahara, and so that worldview would see those areas merged, while others might not.
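One way to picture separate merging attribute values per worldview is a small lookup table. This is a hedged sketch, not how Mapbox or CGAZ store the data: the table layout, the function name, and the "OTHER" worldview label are hypothetical. ESH and MAR are the ISO 3166-1 alpha-3 codes for Western Sahara and Morocco, and the single entry mirrors the US example above.

```python
# Hypothetical merge table: area -> {worldview: code the area is merged into}.
# A missing worldview entry means the area stays a separate polygon.
MERGE_BY_WORLDVIEW = {
    "ESH": {"US": "MAR"},
}


def resolve(area, worldview, table=MERGE_BY_WORLDVIEW):
    """Return the code an area is dissolved into under the given worldview."""
    return table.get(area, {}).get(worldview, area)


us_view = resolve("ESH", "US")        # merged into Morocco under the US worldview
other_view = resolve("ESH", "OTHER")  # placeholder worldview with no merge rule
```

Dissolving polygons on the resolved code would then produce one output layer per worldview from a single set of source geometries.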