eucalyptustree opened 11 years ago
Ah yeah I just noticed that markers for pantries such as "Sacred Heart Tree of Life Pantry", "Salvation Army Attleboro Food Pantry", "St. Paul R.C. Church Dollar-A-Bag", and "Soup Kitchen In Provincetown, Inc." are showing up in the ocean, but others look good.
I'm not convinced the geocoding is correct. For example, clicking on the following shows me a map location in Methuen:
ABC - People's Baptist Church Food Pantry 134 Camden St. Roxbury, MA 2118
Yep, some locations are definitely questionable.
Though one of the Provincetown locations and a Nantucket location seem legitimate. I'm thinking maybe we want to try another source instead of testing the results individually?
Here's an idea for getting geolocation results from the Google Maps API, since I think that was accurate: remove all the location data, and rig the search to only return results that don't have a lat/lng. Use the Google Maps API for geocoding as we were doing before; it'll return some results and then fail on a bunch of others due to rate limiting. Copy the results it does return back into the db, and repeat until everything has a value. Then reset the search and remove the Google Maps geocoding step. Does that sound good?
Well, presumably new locations that are added should not require manual geocoding.
1.) Clear all existing geocoding
2.) Some route on the application will trigger a query that returns pantries without a loc. Something like {'loc':{$exists:false}}. We can limit the query to the google maps rate limit to prevent it from making a gazillion useless requests.
3.) For each in the result, it sends a geo-code request to the Google Maps API and stores the result.
On a "going forward" basis the application would only have to ping the API with new pantries. "Catching up" can just happen over time, with the application geocoding existing locations until the API rate limits, rinse and repeat until done.
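The catch-up loop described above could look something like this sketch. Everything here is hypothetical (the function names, the `RATE_LIMIT` value, and the in-memory array standing in for the pantries collection), and the geocoder is stubbed rather than calling the real Google Maps API:

```javascript
const RATE_LIMIT = 5; // pretend the API allows 5 requests per pass; real limit differs

// stand-in for db.pantries.find({loc: {$exists: false}}).limit(n)
function findMissing(pantries, limit) {
  return pantries.filter(p => !p.loc).slice(0, limit);
}

// stubbed geocoder; the real version would hit the Google Maps API
function geocode(address) {
  return { lat: 42.0, lng: -71.0 }; // fake coordinates
}

// one pass: geocode up to RATE_LIMIT pantries that lack a loc,
// returning how many still need geocoding
function catchUpPass(pantries) {
  for (const p of findMissing(pantries, RATE_LIMIT)) {
    p.loc = geocode(p.address);
  }
  return pantries.filter(p => !p.loc).length;
}

// "rinse and repeat until done": keep running passes until nothing is missing
const pantries = Array.from({ length: 12 }, (_, i) => ({ address: `addr ${i}` }));
let remaining = pantries.length;
while (remaining > 0) remaining = catchUpPass(pantries);
```

In the real app each pass would be triggered by the route mentioned in step 2, so "catching up" just happens over time instead of in one tight loop.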
Once we have an actual "add new pantry" workflow, we would just attach the geocode request to the act of submission. If we ever get so many pantries being added per hour by users that we hit the rate limit... well, we'll have other problems.
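Attaching the geocode to submission could be as simple as the sketch below. `submitPantry` and the array standing in for the database are made up for illustration, and the geocoder is stubbed instead of calling the Maps API:

```javascript
// stubbed geocoder; the real version would call the Google Maps API
function geocode(address) {
  return { lat: 42.3601, lng: -71.0589 }; // fake coordinates
}

// hypothetical submission handler: geocode once at insert time,
// so new pantries never need a later catch-up pass
function submitPantry(db, pantry) {
  const record = { ...pantry, loc: geocode(pantry.address) };
  db.push(record); // stand-in for an insert into the pantries collection
  return record;
}

const db = [];
const saved = submitPantry(db, { name: "Example Pantry", address: "134 Camden St, Roxbury, MA" });
```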
The current geocoding should not be relied on, sorry about that. I'm working on cleaning it up now. If others are interested in working on data issues, get in touch with me and I'll point you to what I've done so far.
see also /files/geocoding-inprogress.xlsx and /files/geocoding-readme.txt
I should explain more, sorry. The current geocoding was done sloppily and without close checking. The batch geocoder I used spat out more results than we have data for (450 geocoded addresses vs. 325 rows), so I did a VLOOKUP to match real rows with the geocoded output. I'm going back through now to fix them up, but am first doing a visual inspection of the data we have (finding duplicates, cleaning up spelling errors, sorting out PO Box issues, etc.).
Our server-side search component now involves geocoding based on the search string. We could certainly write a script to go through all of the records and do a location lookup again.
As @JBaldachino mentioned, we're going to have to bake this into our "add a pantry to the app" workflow. It'd be super-simple to have a fail routine that just calls the lookup function again after a few seconds.
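A fail routine like that might look like the following sketch. Both functions are hypothetical; `makeFlakyLookup` just simulates a lookup that fails a few times so the retry path is visible:

```javascript
// simulates a geocode lookup that fails the first `failures` calls
// (e.g. due to rate limiting), then succeeds with fake coordinates
function makeFlakyLookup(failures) {
  let calls = 0;
  return function lookupLocation(address) {
    calls += 1;
    if (calls <= failures) throw new Error("geocode failed");
    return { lat: 41.7, lng: -70.3 };
  };
}

// call the lookup, and on failure try again up to `retries` more times
function lookupWithRetry(lookup, address, retries = 3) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return lookup(address);
    } catch (err) {
      if (attempt >= retries) throw err;
      // the real routine would wait a few seconds here (setTimeout /
      // await) before trying again, per the comment above
    }
  }
}

// fails twice, then succeeds on the third attempt
const loc = lookupWithRetry(makeFlakyLookup(2), "134 Camden St, Roxbury, MA");
```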
Redid most of the geocoding using Google's geocoder. We no longer have pantries in the ocean or South America (there's still a PO box that thinks it's in Spain, though). The new file is pantries.geocoded.csv. If you look at files/log you should see which of them failed to re-geocode and why.
Verify the geocoded addresses (in /files/geocode-inprogress.xlsx) and fix any errors.