usds / justice40-tool

A tool to identify disadvantaged communities due to environmental, socioeconomic and health burdens
https://screeningtool.geoplatform.gov/
Creative Commons Zero v1.0 Universal

Add documentation on how to combine our data with other data sets in Python #1790

Open switzersc-usds opened 2 years ago

switzersc-usds commented 2 years ago

Is your feature request related to a problem? Please describe. We know that people want to build on top of the CEJST definition of disadvantaged communities: for example, agencies that want to add more nuance based on their program focuses, nonprofits or community groups that have more localized data they want to add or use, and states or other governments that want to add their own open data sets to home in on their jurisdiction. Right now folks have to figure that out on their own, but since it's such a common use case, we should add some documentation to help them get started so they don't have to reinvent the wheel every time.

Describe the solution you'd like As a data user, I should be able to go to the For Developers and Data Scientists section of the main README and see a sub-section on how to combine CEJST data with my data. This should link to a page with info on how to find and use both the downloadable CSV available from the CEJST website AND the big CSV with all of the indicators and data for every tract (since this is quite helpful for data scientists in a multitude of use cases).

I think at a base level we should have steps for using and combining the data in Python. I'll add new issues for the same info in R and for using our tile API to combine map data.

Steps probably look something like this:

  1. Make sure Python and any needed packages are installed
  2. Create a Python file to work in
  3. Load CEJST data in Python from a file downloaded from the website or from a URL
  4. Load your data from whatever source
  5. If your data is at tract level, combine based on census tract ID
  6. If your data is at another geographic resolution, figure out a crosswalk to census tracts. You can use Geocorr to help: https://mcdc.missouri.edu/applications/geocorr2014.html -- having the steps here for how to use this in this process would be great!
  7. Combine!
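
As a starting point for that documentation, the steps above could be sketched roughly like this, using only the Python standard library. Everything specific here is an assumption for illustration: the CEJST column names (`Census tract 2010 ID`, `Identified as disadvantaged`) should be checked against the header of the file you actually download, the crosswalk columns (`source_id`, `tract_id`, `weight`) are hypothetical and will not match Geocorr's output layout as-is, and the sample rows are made up. The weighted allocation in `to_tract_level` only makes sense for counts, not for rates or percentages.

```python
import csv
import io
from collections import defaultdict

# Assumed name of the tract ID column in the CEJST CSV; check your download.
TRACT_COL = "Census tract 2010 ID"


def read_rows(file_obj):
    """Read a CSV into a list of dicts, keeping every value as a string.

    Tract IDs are 11-digit codes that can start with 0, so they must not
    be parsed as numbers.
    """
    return list(csv.DictReader(file_obj))


def normalize_tract(tract_id):
    """Restore a leading zero that a spreadsheet may have dropped."""
    return tract_id.strip().zfill(11)


def join_on_tract(cejst_rows, own_rows, own_tract_col):
    """Left join: keep every CEJST row and add the matching columns."""
    own_by_tract = {normalize_tract(r[own_tract_col]): r for r in own_rows}
    extra_cols = [c for c in own_rows[0] if c != own_tract_col] if own_rows else []
    merged = []
    for row in cejst_rows:
        match = own_by_tract.get(normalize_tract(row[TRACT_COL]), {})
        # Tracts with no match get empty strings for the added columns.
        merged.append({**row, **{c: match.get(c, "") for c in extra_cols}})
    return merged


def to_tract_level(own_rows, crosswalk_rows, value_col):
    """Allocate a count from another geography to tracts.

    Each crosswalk row says what share (weight) of a source unit falls
    in a given tract. Column names here are hypothetical.
    """
    values = {r["source_id"]: float(r[value_col]) for r in own_rows}
    totals = defaultdict(float)
    for link in crosswalk_rows:
        if link["source_id"] in values:
            share = values[link["source_id"]] * float(link["weight"])
            totals[normalize_tract(link["tract_id"])] += share
    return dict(totals)


# Made-up sample data standing in for the real downloads.
cejst_csv = io.StringIO(
    "Census tract 2010 ID,Identified as disadvantaged\n"
    "01000000001,False\n"
    "01000000002,True\n"
)
own_csv = io.StringIO("tract,asthma_visits\n1000000002,12\n")

merged = join_on_tract(read_rows(cejst_csv), read_rows(own_csv), "tract")

county_rows = [{"source_id": "A", "households": "100"}]
crosswalk = [
    {"source_id": "A", "tract_id": "01000000001", "weight": "0.25"},
    {"source_id": "A", "tract_id": "01000000002", "weight": "0.75"},
]
by_tract = to_tract_level(county_rows, crosswalk, "households")
```

For real files, replace the `io.StringIO` objects with `open(path, newline="")`. If pandas is installed, the same join can be written with `pandas.read_csv(..., dtype=str)` followed by `pandas.merge(..., how="left")`; reading the ID column as a string matters there for the same leading-zero reason.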

rfgreene commented 2 years ago

I'm currently working on this on a fork. I'm open to working with other contributors on this, too! Just wanted to leave a comment to make sure no one has to reinvent the wheel 😄

Edit: Changed repository name/link to make it compliant with project guidelines.