ua-snap / epa-justice

US Census and CDC data access via API
MIT License

First production run #3

Closed Joshdpaul closed 4 months ago

Joshdpaul commented 4 months ago

This PR is the first draft of a production-ready routine to fetch and format data for the EPA-Justice-HIA project.

Most of this code parses entries in the NCRPlaces_Census_{MMDDYYYY}.csv table into requests to the various APIs. There is quite a bit of string chopping and table wrangling involved in fetching and merging all this data into a single output. I realize that this kind of code can be difficult to review without seeing interim outputs, so to expose some of those interim pieces, I called a few of the individual data-fetching functions in fetch_data_and_export.ipynb and added URL request printing so you can see the API requests and the JSON returned, and compare them with the reformatted function outputs.
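
For reviewers who want a feel for the row-to-request pattern described above, here is a minimal sketch. It is not the code in this PR. The column names (`state_fips`, `county_fips`, `tract`), the function names, the ACS 5-year endpoint, and the variable `B01003_001E` are all assumptions chosen for illustration. The only thing taken as given is the Census Data API's general behavior: `get`/`for`/`in` query parameters, and a JSON response that is a list of lists with the header in the first row.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; the PR may query different datasets or years.
BASE_URL = "https://api.census.gov/data/2020/acs/acs5"


def build_request_url(row, variables=("NAME", "B01003_001E")):
    """Turn one table row (hypothetical column names) into a Census API URL."""
    params = {
        "get": ",".join(variables),
        "for": f"tract:{row['tract']}",
        "in": f"state:{row['state_fips']} county:{row['county_fips']}",
    }
    # Keep ':' and ',' unescaped so the printed URL stays readable.
    return f"{BASE_URL}?{urlencode(params, safe=':,')}"


def records_from_response(payload):
    """Convert the list-of-lists JSON (header row first) into a list of dicts."""
    header, *data = payload
    return [dict(zip(header, values)) for values in data]


# Hypothetical row; prints the request the way the notebook prints its URLs.
row = {"state_fips": "02", "county_fips": "090", "tract": "000100"}
url = build_request_url(row)
print(url)

# Placeholder payload in the API's response shape (values are not real data).
sample_payload = [
    ["NAME", "B01003_001E", "state", "county", "tract"],
    ["placeholder tract name", "0", "02", "090", "000100"],
]
records = records_from_response(sample_payload)
print(records)
```

Comparing the printed URL, the raw JSON, and the reformatted records side by side is the same check the notebook is meant to support.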

TO TEST:

Joshdpaul commented 4 months ago

Thanks for the detailed review here, Craig.

Good catch on the duplicated comments! I fixed that bug via 1ebd680 and a93ec66. The rows affected were places having simple one-to-one relationships with tract geography. These were missed by the conditional statements and so inherited the comment from the previous iteration (previous row). Should be good to go now.
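
To illustrate the kind of bug described here, the following is a simplified sketch, not the actual code changed in 1ebd680 or a93ec66. The `n_tracts` field, the comment strings, and both function names are hypothetical. It shows how a variable assigned only inside conditionals carries over from the previous loop iteration when no branch matches, and how resetting it at the top of each iteration avoids that.

```python
def comments_buggy(rows):
    """Comment is only assigned inside the branches, so it can go stale."""
    out = []
    comment = ""
    for row in rows:
        if row["n_tracts"] > 1:
            comment = "aggregated from multiple tracts"
        elif row["n_tracts"] == 0:
            comment = "no matching tract"
        # A one-to-one row (n_tracts == 1) matches neither branch,
        # so it silently reuses the previous row's comment.
        out.append(comment)
    return out


def comments_fixed(rows):
    """Reset the comment on every iteration so nothing is inherited."""
    out = []
    for row in rows:
        comment = ""  # default for simple one-to-one rows
        if row["n_tracts"] > 1:
            comment = "aggregated from multiple tracts"
        elif row["n_tracts"] == 0:
            comment = "no matching tract"
        out.append(comment)
    return out


# Hypothetical rows: the second and fourth are one-to-one places.
rows = [{"n_tracts": 3}, {"n_tracts": 1}, {"n_tracts": 0}, {"n_tracts": 1}]
buggy = comments_buggy(rows)
fixed = comments_fixed(rows)
print(buggy)
print(fixed)
```

In the buggy version the one-to-one rows repeat the comment of the row above them; in the fixed version they get an empty comment.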

I also re-exported the environment.yml as per your suggestion - please give that a shot and see what happens.