datamade / open-divisions

Collection of open municipal jurisdictions.
MIT License

Sources for local municipal divisions #3

Open derekeder opened 9 years ago

derekeder commented 9 years ago

Based on the research I've done, this repository will take on the long and daunting task of collecting local municipal divisions in the United States that are not provided by the US Census or another federal data source.

We'll go after boundaries in the following order of priority:

  1. city boundaries (may already be available from Census)
  2. city wards
  3. precincts

From what I can tell from the conversation in https://github.com/opencivicdata/ocd-division-ids/issues/125, Census is only aware that these divisions exist, but doesn't have their geographic boundaries.

For this, we will have to go to the state and county level and find these boundaries where available.

For example, in Minneapolis, GeoJSON files for precincts and wards are available here: http://www.sos.state.mn.us/index.aspx?page=18 (Election Forms, Documents and Maps => Maps and Geographic Data => GeoJSON files)

We'll collect this data here, and assign OCD IDs to the boundaries. Hopefully others will help out, a la https://github.com/openaddresses/
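
To make "assign OCD IDs" concrete, here is a rough sketch of building an identifier in the `ocd-division/country:us/...` form. The helper name `ocd_division_id`, the simplified normalization (lowercase, whitespace to underscores), and the Minneapolis ward ID shown are my assumptions for illustration, not the canonical IDs from the ocd-division-ids repository.

```python
import re


def ocd_division_id(country, *parts):
    """Build an OCD-style division ID from a country code and (type, name) pairs.

    Hypothetical helper: the normalization is deliberately simplified and
    does not cover every rule in the OCD identifier spec.
    """
    def slug(value):
        # Lowercase and replace runs of whitespace with underscores.
        return re.sub(r"\s+", "_", str(value).strip().lower())

    segments = ["ocd-division", f"country:{slug(country)}"]
    for type_, name in parts:
        segments.append(f"{slug(type_)}:{slug(name)}")
    return "/".join(segments)


# Assumed example hierarchy: state -> place -> ward.
ward_id = ocd_division_id("us", ("state", "mn"), ("place", "minneapolis"), ("ward", 1))
print(ward_id)  # ocd-division/country:us/state:mn/place:minneapolis/ward:1
```

Any ID generated this way should still be checked against the existing ocd-division-ids data before being attached to a boundary.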

@jpmckinney @jamesturk @paultag sound good to you?

@iandees @joegermuska are my assumptions about US Census not having city/township precinct and ward boundaries correct?

jpmckinney commented 9 years ago

That's what we had to do for the 438 boundary sets we've collected in Canada. (A boundary set is a concept in Represent that matches a shapefile.) Our repo is https://github.com/opennorth/represent-canada-data/ To give a sense of our sources:

The Canadian Census publishes boundaries for regions (census divisions) and municipalities (census subdivisions). Quebec and New Brunswick aggregate their submunicipal boundaries (wards, etc.). We paid a Saskatchewan company that maintains the boundaries for 36 municipalities to deliver their shapefiles. Otherwise, we had to get the boundaries from each individual municipality.

We have scripts to collect the shapefiles that are publicly available online. Otherwise, we need to email GIS departments or city clerks to send us the data. In some cases, we had to submit FOI requests, and in one case we had to appeal a city's decision to get their data released. In one case, a city charged for their data, and there was no way around paying.

Collecting GIS department/city clerk email addresses, emailing them, following up, etc. is a lot of repetitive work, so I would recommend contracting and training someone to do it, which is what we did.

Update: In order to publish our data without breaching copyright, we also request permission from municipalities to redistribute. We have almost always received permission. I recommend asking for permission once they've already sent you the data, though.

With this process, we got from 83% to 92% of all boundaries by population (we don't actually know precisely, because there is no source anywhere for which Manitoba and Ontario municipalities are divided into wards).

By the way, mySociety has a global MapIt instance that sources boundaries from OpenStreetMap. Mapzen also repackages OSM boundaries. It's worth checking out how many administrative levels down they go.

I'm leaving out the parts about correcting government data, e.g. merging polygons if they should have been the same feature, correcting typos or all-caps, finding missing projections, converting 3D shapefiles to 2D, filtering out bogus features, etc. There's a lot of that to do, too.
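
To give a flavour of that cleanup, here is a minimal sketch over GeoJSON-style feature dicts using only the standard library. It is not our actual tooling: the function names and the `NAME` property key are assumptions, and "merging" here only groups same-named polygons into one MultiPolygon rather than dissolving shared borders.

```python
from collections import defaultdict


def drop_z(coords):
    """Recursively strip a third (Z) ordinate from GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        return list(coords[:2])
    return [drop_z(c) for c in coords]


def clean_features(features, name_key="NAME"):
    """Filter bogus features, fix all-caps names, and group polygons by name."""
    merged = defaultdict(list)
    for feature in features:
        geometry = feature.get("geometry")
        name = (feature.get("properties") or {}).get(name_key)
        # Filter out bogus features: missing geometry or missing/blank name.
        if not geometry or not name or not name.strip():
            continue
        name = name.strip()
        if name.isupper():
            name = name.title()  # "WARD 1" -> "Ward 1"
        coords = drop_z(geometry["coordinates"])
        if geometry["type"] == "Polygon":
            merged[name].append(coords)
        elif geometry["type"] == "MultiPolygon":
            merged[name].extend(coords)
    return [
        {
            "type": "Feature",
            "properties": {name_key: name},
            "geometry": {"type": "MultiPolygon", "coordinates": polygons},
        }
        for name, polygons in merged.items()
    ]


# Made-up sample data: a 3D polygon, a same-named 2D polygon, and a bogus feature.
features = [
    {"type": "Feature", "properties": {"NAME": "WARD 1"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0, 5], [1, 0, 5], [1, 1, 5], [0, 0, 5]]]}},
    {"type": "Feature", "properties": {"NAME": "Ward 1"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[2, 2], [3, 2], [3, 3], [2, 2]]]}},
    {"type": "Feature", "properties": {"NAME": "WARD 2"}, "geometry": None},
]
cleaned = clean_features(features)
```

In this sample, the two "Ward 1" polygons end up in a single 2D MultiPolygon feature and the geometry-less "WARD 2" feature is dropped. Missing projections and real polygon dissolves need a proper GIS library and are out of scope for the sketch.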

iandees commented 9 years ago

You are correct. Very few of Census' geographies are real, legal boundaries. They especially don't have cities, precincts, or wards. There are CDPs (Census Designated Places), which tend to be pretty darn close to city boundaries, but aren't equivalent.

Be aware that Mapzen is working on collecting more general boundary datasets with their gazetteer: https://mapzen.com/blog/who-s-on-first. Their data is on GitHub: https://github.com/whosonfirst/whosonfirst-data/

/cc @thisisaaronland