Police-Data-Accessibility-Project / data-sources-app

An API and UI for using and maintaining the Data Sources database
MIT License

lat/lng updating script for agencies #246

Open josh-chamberlain opened 7 months ago

josh-chamberlain commented 7 months ago

Context

related to https://github.com/Police-Data-Accessibility-Project/data-source-map/issues/9#issuecomment-2059638208

Basically, we need a lat/lng to show an agency on the map. Fetching it each time gets expensive because we have ~20k agencies, but they don't change much, so we should update them in a batch and then fetch the rest at a trickle.

Requirements

This script should use the Mapbox geocoding API. We have a MAPBOX_GEOCODING org secret for GitHub Actions; DM if you need one for testing. https://docs.mapbox.com/api/search/geocoding/

  1. create a script to generate lat/lng for agencies where they are blank

    • GitHub Actions; let's set it up to fetch the blank ones once, and then maybe iterate through them every once in a while to update them
    • We can try using one of those APIs to search by ZIP code or agency name, always using state and county to make the search more precise and disambiguate similar place names
    • This will not work for some agencies, but we do not need these to be super precise. generating a lat/lng centered on the municipality/county/state is fine
      • jurisdiction_type will be helpful for determining whether to center an agency on the municipality, county, or state
  2. update our database via API endpoint

    • may need to create or modify the endpoint
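
A minimal sketch of the geocoding step in the requirements above, assuming the Mapbox v5 forward geocoding endpoint (`mapbox.places`), whose responses list features with a `center` of `[longitude, latitude]`. The agency field names (`name`, `zip_code`, `county_name`, `state_iso`) are hypothetical and would need to match the real agencies schema.

```python
import json
import urllib.parse
import urllib.request

# Mapbox v5 forward geocoding endpoint (assumption: the script targets v5).
MAPBOX_URL = "https://api.mapbox.com/geocoding/v5/mapbox.places/{query}.json"


def build_query(agency):
    """Prefer the ZIP code, else the agency name; always append county and state.

    The dictionary keys are hypothetical field names.
    """
    primary = agency.get("zip_code") or agency.get("name")
    county = agency.get("county_name")
    parts = [primary, f"{county} County" if county else None, agency.get("state_iso")]
    return ", ".join(p for p in parts if p)


def build_url(query, token):
    """Build the request URL, restricted to US results and a single best match."""
    params = urllib.parse.urlencode({"access_token": token, "country": "us", "limit": 1})
    return MAPBOX_URL.format(query=urllib.parse.quote(query, safe="")) + "?" + params


def parse_center(payload):
    """Return (lat, lng) from a geocoding response, or None if nothing matched."""
    features = payload.get("features") or []
    if not features:
        return None
    lng, lat = features[0]["center"]  # Mapbox orders coordinates as [lng, lat]
    return lat, lng


def geocode(agency, token):
    """Call Mapbox for one agency and return (lat, lng) or None."""
    url = build_url(build_query(agency), token)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_center(json.load(resp))
```

In a GitHub Actions job, the token would presumably come from the MAPBOX_GEOCODING secret exposed as an environment variable, for example `geocode(agency, os.environ["MAPBOX_GEOCODING"])`.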

When we run it the first time, we'll need to make a lot of calls for missing lat/lng—but we don't add agencies very often any more.

Docs

mbodeantor commented 7 months ago

@josh-chamberlain Not sure it matters too much which service we use, but since we already have stuff on a Google account, it might make billing slightly easier to go with that. I would just need an API key once you activate one of those.

mbodeantor commented 7 months ago

@josh-chamberlain if we're not worried about being super accurate, most of those agencies missing lat/lng (3475 of 5777) have a county name that we could use to assign lat/lng

josh-chamberlain commented 7 months ago

we do have mapbox too for the work Joshua G is doing!

mbodeantor commented 7 months ago

@josh-chamberlain using lat/lng for zip codes and counties for the agencies that have them dropped the number of agencies without lat/lng to 1873

Once we have an API key for a geocoding service I can fill the rest
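
The ZIP code and county pass described here could look roughly like the sketch below. It is not the code that was actually run; the lookup tables (`zip_centroids`, `county_centroids`) and the agency field names are hypothetical, and the centroid data would have to come from some reference dataset.

```python
def fill_from_lookups(agencies, zip_centroids, county_centroids):
    """Assign lat/lng from local lookup tables, preferring ZIP over county.

    zip_centroids maps a ZIP code string to (lat, lng).
    county_centroids maps (state_iso, county_name) to (lat, lng).
    Returns the agencies that still have no coordinates afterwards.
    """
    remaining = []
    for agency in agencies:
        if agency.get("lat") is not None and agency.get("lng") is not None:
            continue  # already has coordinates
        county_key = (agency.get("state_iso"), agency.get("county_name"))
        coords = zip_centroids.get(agency.get("zip_code")) or county_centroids.get(county_key)
        if coords is None:
            remaining.append(agency)  # left for the geocoding service
        else:
            agency["lat"], agency["lng"] = coords
    return remaining
```

Whatever this returns is the set that still needs a call to the geocoding service.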

josh-chamberlain commented 7 months ago

@mbodeantor I DM'd you one from mapbox

mbodeantor commented 7 months ago

@josh-chamberlain Here's a sample of the remaining agencies without lat/lng, run through the Mapbox search and then geocode endpoints. It already seems like there are some assumptions we could make, for example around school district police departments, but we might want manual review for the bulk.

agencies_with_lat_lng.csv

josh-chamberlain commented 7 months ago

@mbodeantor nice, looks good! falling back to something general is OK too.
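
One way to express "falling back to something general" is an ordered list of progressively broader search strings, tried until one returns a result. This is only a sketch: the `jurisdiction_type` values (`"county"`, `"state"`) and the field names (`municipality`, `county_name`, `state_iso`) are assumptions, not the actual schema.

```python
def fallback_queries(agency):
    """Return search strings from most to least specific for one agency.

    County-level agencies skip the municipality; state-level agencies
    skip both municipality and county.
    """
    jurisdiction = (agency.get("jurisdiction_type") or "").lower()
    state = agency.get("state_iso")
    county = agency.get("county_name")
    municipality = agency.get("municipality")

    candidates = []
    if jurisdiction not in ("county", "state") and municipality:
        candidates.append(", ".join(p for p in (municipality, state) if p))
    if jurisdiction != "state" and county:
        candidates.append(", ".join(p for p in (f"{county} County", state) if p))
    if state:
        candidates.append(state)
    return candidates
```

A caller would geocode each string in turn and keep the first hit, so the stored point is as specific as the available location data allows.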

josh-chamberlain commented 6 months ago

moving this back to TODO—Marty's no longer able to work on this but I added the API key as a variable, ping me if you need it or have any questions.

maxachis commented 6 days ago

@josh-chamberlain This might be useful to have as another automation job, potentially. We already have the endpoint for updating agencies, so it'd be a matter of retrieving agencies with missing lat/lng and then submitting them.

I don't know how Marty did this, but we'd probably want to make sure the logic is within our repository, so others can follow up on it.

Additionally, we'll want to make note of which agencies we've previously tried to find geocoding information for but failed. I don't know how common that will be, since we require locational information, but it's worth taking into account.
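
A rough sketch of that automation loop, including the bookkeeping for agencies that could not be geocoded. The `geocode` and `update_agency` callables are placeholders for the Mapbox call and the existing agency update endpoint, and `failed_ids` stands in for wherever failures end up being persisted; all of these names are hypothetical.

```python
def run_batch(agencies, geocode, update_agency, failed_ids):
    """Geocode agencies missing lat/lng and submit the results.

    agencies:      iterable of dicts with at least an "id" key
    geocode:       callable(agency) -> (lat, lng) or None
    update_agency: callable(agency_id, fields) that submits the update
    failed_ids:    mutable set of agency ids that failed previously;
                   new failures are added so later runs can skip them
    Returns the number of agencies updated.
    """
    updated = 0
    for agency in agencies:
        agency_id = agency["id"]
        if agency_id in failed_ids:
            continue  # already tried and failed; don't spend another call
        coords = geocode(agency)
        if coords is None:
            failed_ids.add(agency_id)
            continue
        lat, lng = coords
        update_agency(agency_id, {"lat": lat, "lng": lng})
        updated += 1
    return updated
```

Passing the geocoder and the updater in as arguments keeps the loop testable without network access, and lets the failure record live in a file or a database column without changing the loop.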

josh-chamberlain commented 5 days ago

@maxachis yes, this would be a good automation! A small one. We have mapbox credentials already, so it should be simple to make a new little script!