whosonfirst / py-mapzen-whosonfirst-spatial

Python library for working with spatial databases (and services) and Who's On First documents.
BSD 3-Clause "New" or "Revised" License

Trouble PIPing constituency record 1108767273 #9

Open dphiffer opened 6 years ago

dphiffer commented 6 years ago

Here's a simple test that demonstrates an issue I've encountered trying to PIP constituency record 1108767273:

#!/usr/bin/env python

import mapzen.whosonfirst.utils
import mapzen.whosonfirst.uri
import mapzen.whosonfirst.geojson
import mapzen.whosonfirst.hierarchy
import mapzen.whosonfirst.spatial.whosonfirst

api_key = "mapzen-xxxxxx" # (replace me)
client = mapzen.whosonfirst.spatial.whosonfirst.api(api_key=api_key)
ancs = mapzen.whosonfirst.hierarchy.ancestors(spatial_client=client)

# locality
nyc_id = 85977539
nyc = mapzen.whosonfirst.utils.load('/usr/local/data/whosonfirst-data/data/', nyc_id)

ancs.rebuild_feature(nyc)
print "NYC hierarchy:"
print nyc["properties"]["wof:hierarchy"]

# Michigan's 6th State House District
mi6_id = 1108767273
mi6 = mapzen.whosonfirst.utils.load('/usr/local/data/whosonfirst-data-constituency-us/data/', mi6_id)

ancs.rebuild_feature(mi6)
print "MI-6 hierarchy:"
print mi6["properties"]["wof:hierarchy"]

Output:

NYC hierarchy:
[{u'region_id': 85688543, u'continent_id': 102191575, u'localadmin_id': 404521211, u'country_id': 85633793, u'locality_id': 85977539, u'county_id': 102082361}]
MI-6 hierarchy:
[{u'constituency_id': 1108767273}]

dphiffer commented 6 years ago

Note that I manually adjusted the reversegeo centroid to fall within the US border:

"reversegeo:latitude":42.331139,
"reversegeo:longitude":-83.058014,

thisisaaronland commented 6 years ago

cc @stepps00 and @bcamper and @migurski since this seems relevant, directly or indirectly, to their interests.

So, I think there are two issues here:

  1. Constituencies don't have any list of potential parents because they are fiddly and complicated and we just never got around to sorting that out - https://github.com/whosonfirst/whosonfirst-placetypes/blob/master/placetypes/constituency.json

  2. This is triggering a logic "error" in py-mz-wof-hierarchy. In order to ensure a (common) hierarchy, a place needs to know who its ancestors are (so we don't assign a neighbourhood as part of a county's hierarchy, for example), which isn't really possible unless we have a list of possible parents:

https://github.com/whosonfirst/py-mapzen-whosonfirst-hierarchy/blob/master/mapzen/whosonfirst/hierarchy/__init__.py#L339-L397
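
To make the second point concrete, here is a minimal sketch (not the code linked above) of why an empty list of possible parents leaves a record with only itself in its hierarchy. The `ALLOWED_ANCESTORS` table, the `build_hierarchy` helper, and the shape of the candidate records are all assumptions made for illustration; the IDs are taken from the test output earlier in the thread.

```python
# Hypothetical table of allowed ancestor placetypes; NOT the real placetype spec.
ALLOWED_ANCESTORS = {
    "locality": ["localadmin", "county", "region", "country", "continent"],
    "constituency": [],  # nothing defined, which is the situation described above
}

def build_hierarchy(record, candidates):
    """Return one hierarchy dict for `record`, keeping only candidates
    whose placetype is an allowed ancestor of the record's placetype."""
    placetype = record["wof:placetype"]
    allowed = ALLOWED_ANCESTORS.get(placetype, [])
    hierarchy = {placetype + "_id": record["wof:id"]}
    for candidate in candidates:
        if candidate["wof:placetype"] in allowed:
            hierarchy[candidate["wof:placetype"] + "_id"] = candidate["wof:id"]
    return hierarchy

# Candidates shaped the way a point-in-polygon lookup might return them (assumed).
nyc = {"wof:id": 85977539, "wof:placetype": "locality"}
nyc_candidates = [
    {"wof:id": 85688543, "wof:placetype": "region"},
    {"wof:id": 85633793, "wof:placetype": "country"},
]
mi6 = {"wof:id": 1108767273, "wof:placetype": "constituency"}
mi6_candidates = [{"wof:id": 85633793, "wof:placetype": "country"}]

print(build_hierarchy(nyc, nyc_candidates))
print(build_hierarchy(mi6, mi6_candidates))
```

With no allowed ancestors, every candidate is filtered out for the constituency, so the result contains only `constituency_id`, which matches the symptom in the original report.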

I suppose we could either:

  1. Simply say that constituencies are parented by regions (or higher) and move on... but I am still working on the first cup of coffee so I am not sure what the follow-on effects of that will be
  2. Something else that I've managed to forget already so presumably it wasn't as clever as I imagined 30 seconds ago...

Discuss!

dphiffer commented 6 years ago

I think option 1 is a good starting point. If we wanted to make the logic more clever (which... no?) I would say it could change depending on the value of wof:association.

thisisaaronland commented 6 years ago

I suppose the commonality across all constituencies is country, at least until the United Nations takes over.

In the meantime, making country the default parent seems like a reasonable middle ground for a first pass.

The second-pass would be to define the ways that parent can be redefined by association. So that would mean:

  1. Update constituency.json to be "wof:parent": [ "country" ]
  2. Figure out where to define the overrides - maybe something like "ASSOCIATION:parent": [ "..." ] in constituency.json
  3. Update the tools to check whether a record has a wof:association property and then look up parentage accordingly
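
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `CONSTITUENCY_SPEC` dict stands in for a future constituency.json, the `us-house:parent` key is a made-up instance of the "ASSOCIATION:parent" idea, and `allowed_parents` is a hypothetical helper, not an existing function in any of the tools.

```python
# Hypothetical contents of constituency.json after steps 1 and 2.
# "us-house:parent" is an invented example of an "ASSOCIATION:parent" override.
CONSTITUENCY_SPEC = {
    "wof:parent": ["country"],
    "us-house:parent": ["region", "country"],
}

def allowed_parents(properties, spec=CONSTITUENCY_SPEC):
    """Step 3: use the association-specific parent list when the record has a
    wof:association with an override defined, else fall back to the default."""
    association = properties.get("wof:association")
    if association:
        override = spec.get(association + ":parent")
        if override is not None:
            return override
    return spec["wof:parent"]

print(allowed_parents({"wof:association": "us-house"}))
print(allowed_parents({}))
```

Records without a `wof:association`, or with one that has no override, simply get the default `[ "country" ]`, so the first pass keeps working while overrides are added incrementally.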

Did we ever decide on wof:association as the Boring Name to distinguish one constituency from another? @stepps00 ?

Discuss!

migurski commented 6 years ago

Constituency parents are going to be tricky, because they’re legally defined in different places. Congressional district counts are determined by the US Census, but the district borders are determined by states. Lower-level city or county districts are determined at those levels. It seems like a concept that can float freely throughout the hierarchy.

stepps00 commented 6 years ago

The idea was to use wof:association, wof:association_type, and wof:association_era properties to represent parentage for constituency records.

wof:association - This would be country-specific, and represent what constituency branch the record is a part of. The US House of Representatives, for example, would be "us-house".

wof:association_type - This would represent the type of association: bicameral, unicameral, etc.

wof:association_era - This would represent the formal name of that association. An example would be "115th Congress".
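
As a rough illustration, the three properties on a US House district record might look like the dict below. The values come from the examples above; pairing "bicameral" with "us-house" is an assumption, and `missing_association_keys` is a hypothetical helper for spotting records where the properties have not been filled in yet.

```python
# Illustrative properties only; pairing "bicameral" with "us-house" is assumed.
properties = {
    "wof:placetype": "constituency",
    "wof:association": "us-house",
    "wof:association_type": "bicameral",
    "wof:association_era": "115th Congress",
}

ASSOCIATION_KEYS = (
    "wof:association",
    "wof:association_type",
    "wof:association_era",
)

def missing_association_keys(props):
    """List the association properties that are absent or empty on a record."""
    return [key for key in ASSOCIATION_KEYS if not props.get(key)]

print(missing_association_keys(properties))
```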

These properties were added in as a "first pass" in this PR, but the details were never fully hammered out. This still doesn't solve the issue of figuring out the hierarchy for these records, though.

I think a good first step would be to update the US constituency records to include the region parent, since that seems to be an easy enough task, then update the hierarchies with additional placetypes once that's complete.

Tagging @burritojustice because he created a nice, detailed flowchart for various constituency records at some point...

stepps00 commented 6 years ago

Also relevant: https://github.com/whosonfirst-data/whosonfirst-data-constituency-us/issues/5 and https://github.com/whosonfirst-data/whosonfirst-data-constituency-us/issues/4.

thisisaaronland commented 6 years ago

Given all of the above:

  1. Does anyone see a reason not to at the very least assign country as the minimal viable parent for a constituency?
  2. Is it safe to extend that list to be (in order of precedence) [ "region", "country" ] ?
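
For the second question, "in order of precedence" could be read as in the sketch below: take the first placetype in the list for which a containing candidate exists. The `pick_parent` helper and the candidate record shape are assumptions for illustration, and the IDs are borrowed from the NYC output earlier in the thread.

```python
def pick_parent(candidates, precedence=("region", "country")):
    """Return the wof:id of the first candidate whose placetype appears in
    `precedence` (checked in order), or None if nothing matches."""
    by_placetype = {c["wof:placetype"]: c["wof:id"] for c in candidates}
    for placetype in precedence:
        if placetype in by_placetype:
            return by_placetype[placetype]
    return None

# Hypothetical point-in-polygon results; IDs are only illustrative.
candidates = [
    {"wof:id": 85633793, "wof:placetype": "country"},
    {"wof:id": 85688543, "wof:placetype": "region"},
]

print(pick_parent(candidates))                      # region wins over country
print(pick_parent(candidates[:1]))                  # falls back to country
print(pick_parent(candidates, precedence=("country",)))
```

Under this reading, a record with no containing region still ends up parented by its country, which is the "minimal viable parent" from the first question.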