artsy / bearden

A simple database of organizations
MIT License

Add fields to get Street Address, State, Zip from Geocoder #333

Open madeleineb opened 7 years ago

madeleineb commented 7 years ago

These fields are used to more efficiently divide work among regionally-specialized GP team members.

Request is to store Street Address, State/Province, and Zip in separate fields, as City and Country already are.

cc: @nicholassewitz

nicholassewitz commented 7 years ago

Can we add State to this as well, please, if the location is in the US?


madeleineb commented 7 years ago

@nicholassewitz I just updated the request to include state/province! We should get the province for anywhere outside of the US, where possible, for consistency.

jonallured commented 7 years ago

I just did a little research on what this would take and it's looking quite simple. There's a section of the geocoder README that details what bits we get back in a response:

https://github.com/alexreisner/geocoder#advanced-geocoding

So, what I'm thinking is that this will end up being a database migration to add columns to the locations table and then it would just be wiring up here in the geocoding job:

https://github.com/artsy/bearden/blob/master/app/jobs/geocode_location_job.rb#L26
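As a rough sketch of what that wiring might look like, assuming the migration adds `state` and `postal_code` columns (hypothetical names) and that the geocoder result exposes `state` and `postal_code` methods as the README describes. `StubResult` and `geocoded_attributes` are made up for illustration so the snippet runs without the gem or the app; this is not the actual job code.

```ruby
# Hypothetical stand-in for a geocoder result, so the sketch runs on its own.
StubResult = Struct.new(:city, :country, :state, :postal_code, keyword_init: true)

# Builds the attributes to save on a location. The :state and :postal_code
# keys assume new columns with those names; they are not in the schema yet.
def geocoded_attributes(result)
  {
    city: result.city,
    country: result.country,
    state: result.state,
    postal_code: result.postal_code
  }
end

result = StubResult.new(
  city: "New York",
  country: "United States",
  state: "New York",
  postal_code: "10013"
)
p geocoded_attributes(result)
```

Keeping the mapping in one small method would make it easy to restrict the new fields to US locations later, if that turns out to be the decision.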

Are we wanting this done for only US locations or should we divvy up this info from all geocoder responses?

jonallured commented 7 years ago

Oh, the other thing to consider is what to do about the locations already geocoded - would we want to go back through them all again to update our data?

jonallured commented 7 years ago

Ok, I've done a little more digging here and wanted to share. That geocoder doc I linked above mentioned a list of data points that are always available on a result. I took a US and UK address as examples and ran them like this:

```ruby
require "geocoder"
require "json"

artsy = "401 Broadway, 25th Floor, New York, NY 10013"
b_palace = "Westminster, London SW1A 1AA, UK"
places = [artsy, b_palace]
# Take the first geocoder match for each address
results = places.map { |address| Geocoder.search(address).first }
attrs = %i[address city coordinates country country_code latitude longitude postal_code state state_code]
print_attrs = ->(result) { attrs.each { |attr| puts "#{attr} => #{result.send(attr)}" } }
results.each { |result| print_attrs.call(result) }

# address => 401 Broadway, New York, NY 10013, USA
# city => New York
# coordinates => [40.7189608, -74.00279379999999]
# country => United States
# country_code => US
# latitude => 40.7189608
# longitude => -74.00279379999999
# postal_code => 10013
# state => New York
# state_code => NY

# address => London SW1A 1AA, UK
# city => Greater London
# coordinates => [51.501009, -0.1415876]
# country => United Kingdom
# country_code => GB
# latitude => 51.501009
# longitude => -0.1415876
# postal_code => SW1A 1AA
# state => England
# state_code => England
```

One thing you'll notice is that there really isn't anything I'd call a Street Address there. Like, for Artsy, I would have expected just "401 Broadway" in one of those fields, but alas. Then I noticed that ALL data can be found in a data hash:

```ruby
results.each { |result| puts result.data.to_json }
```

```
{
  "address_components": [
    { "long_name": "401", "short_name": "401", "types": [ "street_number" ] },
    { "long_name": "Broadway", "short_name": "Broadway", "types": [ "route" ] },
    { "long_name": "Manhattan", "short_name": "Manhattan", "types": [ "political", "sublocality", "sublocality_level_1" ] },
    { "long_name": "New York", "short_name": "New York", "types": [ "locality", "political" ] },
    { "long_name": "New York County", "short_name": "New York County", "types": [ "administrative_area_level_2", "political" ] },
    { "long_name": "New York", "short_name": "NY", "types": [ "administrative_area_level_1", "political" ] },
    { "long_name": "United States", "short_name": "US", "types": [ "country", "political" ] },
    { "long_name": "10013", "short_name": "10013", "types": [ "postal_code" ] }
  ],
  "formatted_address": "401 Broadway, New York, NY 10013, USA",
  "geometry": {
    "bounds": {
      "northeast": { "lat": 40.7191736, "lng": -74.00250530000001 },
      "southwest": { "lat": 40.718748, "lng": -74.0030823 }
    },
    "location": { "lat": 40.7189608, "lng": -74.00279379999999 },
    "location_type": "ROOFTOP",
    "viewport": {
      "northeast": { "lat": 40.72030978029149, "lng": -74.00144481970851 },
      "southwest": { "lat": 40.7176118197085, "lng": -74.00414278029152 }
    }
  },
  "partial_match": true,
  "place_id": "ChIJW5D4lI1ZwokRUKDFbb4uTIA",
  "types": [ "premise" ]
}
{
  "address_components": [
    { "long_name": "SW1A 1AA", "short_name": "SW1A 1AA", "types": [ "postal_code" ] },
    { "long_name": "London", "short_name": "London", "types": [ "postal_town" ] },
    { "long_name": "Greater London", "short_name": "Greater London", "types": [ "administrative_area_level_2", "political" ] },
    { "long_name": "England", "short_name": "England", "types": [ "administrative_area_level_1", "political" ] },
    { "long_name": "United Kingdom", "short_name": "GB", "types": [ "country", "political" ] }
  ],
  "formatted_address": "London SW1A 1AA, UK",
  "geometry": {
    "bounds": {
      "northeast": { "lat": 51.5037514, "lng": -0.1397412 },
      "southwest": { "lat": 51.4995847, "lng": -0.1488254 }
    },
    "location": { "lat": 51.501009, "lng": -0.1415876 },
    "location_type": "APPROXIMATE",
    "viewport": {
      "northeast": { "lat": 51.5037514, "lng": -0.1397412 },
      "southwest": { "lat": 51.4995847, "lng": -0.1488254 }
    }
  },
  "place_id": "ChIJ1bidZScFdkgRqR6QyL-kxcA",
  "types": [ "postal_code" ]
}
```

Unless I'm missing something, even in there, Google isn't returning a street address for us. Should I move forward with just adding state and postal code? And I'm still unclear if these additional fields should only apply to US addresses or all of them.
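Worth noting: the `address_components` array in the US response above does include `street_number` ("401") and `route` ("Broadway") entries, while the UK postal-code response has neither. So a street address might be something we can assemble ourselves when those components are present. Here is a minimal sketch of that idea, using trimmed copies of the two responses. The helper names `component` and `street_address` are hypothetical, and the "number then route" ordering is an assumption that fits US addresses but not every country.

```ruby
# Returns the value under `key` for the first component tagged with `type`,
# or nil when no such component exists.
def component(data, type, key = "long_name")
  match = data.fetch("address_components", []).find { |c| c["types"].include?(type) }
  match && match[key]
end

# Joins street number and route (US-style ordering, an assumption).
# Returns nil when neither component is present.
def street_address(data)
  parts = [component(data, "street_number"), component(data, "route")].compact
  parts.empty? ? nil : parts.join(" ")
end

# Trimmed copies of the two responses shown in this thread.
us = { "address_components" => [
  { "long_name" => "401", "short_name" => "401", "types" => ["street_number"] },
  { "long_name" => "Broadway", "short_name" => "Broadway", "types" => ["route"] },
  { "long_name" => "New York", "short_name" => "NY", "types" => ["administrative_area_level_1", "political"] },
  { "long_name" => "10013", "short_name" => "10013", "types" => ["postal_code"] }
] }
uk = { "address_components" => [
  { "long_name" => "SW1A 1AA", "short_name" => "SW1A 1AA", "types" => ["postal_code"] }
] }

p street_address(us) # => "401 Broadway"
p street_address(uk) # => nil
```

The same `component` lookup with `"short_name"` as the key would give "NY" for `administrative_area_level_1`, which may be handy if a state code is wanted alongside the full name.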

Finally, I forgot that we're storing the result, so rather than going back through and re-geocoding the records that are already done, I can simply write a rake task to dig into the stored results and pull out whatever data is already there.
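A rough sketch of what the core of that backfill could look like, assuming the stored result is the raw response kept as a JSON string. The record shape and the `:result`, `:state`, and `:postal_code` names are hypothetical, and plain hashes stand in for the real models so the snippet runs on its own.

```ruby
require "json"

# Hypothetical records: plain hashes standing in for location rows, each with
# the raw geocoder response stored as a JSON string (or nil if never geocoded).
locations = [
  { id: 1, result: '{"address_components":[' \
    '{"long_name":"New York","short_name":"NY","types":["administrative_area_level_1","political"]},' \
    '{"long_name":"10013","short_name":"10013","types":["postal_code"]}]}' },
  { id: 2, result: nil }
]

# Finds the long name of the first component tagged with `type`, or nil.
def long_name_for(components, type)
  found = components.find { |c| c["types"].include?(type) }
  found && found["long_name"]
end

# Returns a copy of the record with state and postal code filled in from the
# stored response. Records without a stored response are returned unchanged.
def backfill(location)
  return location if location[:result].nil?

  components = JSON.parse(location[:result]).fetch("address_components", [])
  location.merge(
    state: long_name_for(components, "administrative_area_level_1"),
    postal_code: long_name_for(components, "postal_code")
  )
end

backfilled = locations.map { |location| backfill(location) }
backfilled.each { |location| p location }
```

In the real task this logic would presumably sit inside a rake task that loads the Rails environment and iterates the locations in batches, updating each record instead of building new hashes.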