open-austin / construction-permits

:construction: City of Austin Construction Permits
The Unlicense

Addresses need geocoding #3

Closed spatialaustin closed 8 years ago

spatialaustin commented 8 years ago

The permit addresses need to be geocoded - they do not have lat/lon info. There are ~180,000 unique addresses.

spatialaustin commented 8 years ago

I'm doing some cleaning and will attempt to match up with the city's address points layer. It looks like they're coming from a free-text field; there's a lot of variation in formatting.
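
A minimal sketch of that clean-then-match step, assuming the address points layer has been exported to a plain mapping of address text to (lat, lon). The `SUFFIXES` table and the names `normalize_address` and `match_addresses` are hypothetical and are not taken from the repo's scripts:

```python
import re

# Hypothetical abbreviation table; a real one would be built from the permit data.
SUFFIXES = {
    "STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD",
    "DRIVE": "DR", "ROAD": "RD", "LANE": "LN",
}

def normalize_address(raw):
    """Collapse case, punctuation, and whitespace so formatting variants compare equal."""
    text = re.sub(r"[^\w\s]", " ", raw.upper())  # replace punctuation with spaces
    tokens = [SUFFIXES.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)

def match_addresses(permit_addresses, address_points):
    """Return (matched, unmatched).

    address_points is assumed to map address text -> (lat, lon).
    """
    lookup = {normalize_address(addr): coords for addr, coords in address_points.items()}
    matched, unmatched = {}, []
    for raw in permit_addresses:
        coords = lookup.get(normalize_address(raw))
        if coords is None:
            unmatched.append(raw)  # left over for a geocoding API or manual review
        else:
            matched[raw] = coords
    return matched, unmatched
```

Anything that lands in `unmatched` would still need a geocoding service, so exact matching like this only shrinks the backlog.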

This is for the historical dataset. We should be able to work within API quotas (Google, Bing, etc.) for daily permit pulls.

spatialaustin commented 8 years ago

i've added lat/lon to the permit reports.

now that the backlog has been cut down, a geocoding library could be used to handle daily report dumps. e.g. https://pypi.python.org/pypi/geocoder

spatialaustin commented 8 years ago

the permit-pull + geocoding script is close. i need to relearn how to write the rows from a csv.DictReader back out to a file and that's about it: https://github.com/open-austin/construction-permits/blob/more-geocoding/dank-eshet.py
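
For reference, one way to get DictReader rows back out to a file is to pair the reader with a csv.DictWriter. This is only a sketch: the `address` column name, the added `latitude`/`longitude` columns, and the `geocode` callable are assumptions, not what the linked script does.

```python
import csv

def add_coordinates(in_path, out_path, geocode):
    """Copy a permit CSV, appending latitude/longitude columns.

    `geocode` is any callable that takes an address string and returns
    (lat, lon), or None when the address could not be placed.
    """
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # Reuse the input header and extend it with the two new columns.
        fieldnames = list(reader.fieldnames or []) + ["latitude", "longitude"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            coords = geocode(row["address"])  # "address" column name is assumed
            row["latitude"], row["longitude"] = coords if coords else ("", "")
            writer.writerow(row)
```

Rows that fail to geocode keep empty coordinate cells, so they can be picked up again in a later pass.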

there's going to be another backlog of permits to geocode. boo-hoo.

spatialaustin commented 8 years ago

this process needs to be repeated. holding off until the geocoding cron job is set up.

davemcphee commented 8 years ago

Things I've learned using geopy:

  1. Use GoogleV3
  2. Use an API key - the free tier sounds great, until you suddenly get the "max API calls reached" error because you're hosting on AWS / Heroku / etc. and Google counts all free API queries from the same domain as coming from the same account.
  3. Use a locality to constrain searches:

         location = geolocator.geocode(address, components={'locality': 'austin'})

  4. GoogleV3 will always return a location, even if you feed it garbage for its address. Check the resulting lat / long and consider it a failure if they match the values below - that's the center of Austin, the default result for unknown addresses constrained to Austin:

         (location.latitude == 30.267153) and (location.longitude == -97.7430608)
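
Pulling the tips above together, a rough sketch might look like this. The helper names `is_fallback` and `geocode_in_austin` are made up for illustration, and the tolerance-based comparison is a swap-in for the exact `==` check (floats that round-trip through JSON usually compare equal, but a small tolerance is safer). The geolocator is passed in, so the snippet itself needs neither geopy nor network access:

```python
import math

# Fallback coordinates quoted above (center of Austin).
AUSTIN_CENTER = (30.267153, -97.7430608)

def is_fallback(location, tol=1e-6):
    """True if the result is missing or sits on the Austin-center fallback point."""
    if location is None:
        return True
    return (math.isclose(location.latitude, AUSTIN_CENTER[0], abs_tol=tol)
            and math.isclose(location.longitude, AUSTIN_CENTER[1], abs_tol=tol))

def geocode_in_austin(geolocator, address):
    """Return (lat, lon), or None when the geocoder could not place the address.

    `geolocator` is expected to behave like geopy's GoogleV3, e.g.
        from geopy.geocoders import GoogleV3
        geolocator = GoogleV3(api_key="YOUR_API_KEY")  # placeholder key
    """
    location = geolocator.geocode(address, components={"locality": "austin"})
    if is_fallback(location):
        return None
    return (location.latitude, location.longitude)
```

Treating the fallback point as a failure does mean a permit that really sits at those exact coordinates would be dropped, which seems like an acceptable trade for this dataset.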

Good luck!

spatialaustin commented 8 years ago

all very great tips, thanks!

spatialaustin commented 8 years ago

backlog of permits back to 1980 geocoded and committed.

spatialaustin commented 8 years ago

i also renamed all of the pre-cron-job CSVs to match the YYYY-MM-DD convention. so that's good.