diegogentilepassaro / min_wage_rent

GNU General Public License v3.0

Clean `acs_rents` #276

Closed santiagohermo closed 3 weeks ago

santiagohermo commented 1 month ago

In this issue we want to clean the acs_rents data that was collected in #275.

The goal is to create base/acs_rents_clean with all the variables in a zcta-year panel.

santiagohermo commented 1 month ago

Completed, with the exception of the variable "rent as share of income" @diegogentilepassaro. I'll open a PR in a bit, but I first wanted to explain the two issues with this variable.

Issues with B25070

1. Data gives counts per bin

The data in `B25070_gross_rent_as_a_percentage_of_household_income_block_group` looks like this: [image]

Metadata

```csv
"Column Name","Label"
"GEO_ID","Geography"
"NAME","Geographic Area Name"
"B25070_001E","Estimate!!Total"
"B25070_001M","Margin of Error!!Total"
"B25070_002E","Estimate!!Total!!Less than 10.0 percent"
"B25070_002M","Margin of Error!!Total!!Less than 10.0 percent"
"B25070_003E","Estimate!!Total!!10.0 to 14.9 percent"
"B25070_003M","Margin of Error!!Total!!10.0 to 14.9 percent"
"B25070_004E","Estimate!!Total!!15.0 to 19.9 percent"
"B25070_004M","Margin of Error!!Total!!15.0 to 19.9 percent"
"B25070_005E","Estimate!!Total!!20.0 to 24.9 percent"
"B25070_005M","Margin of Error!!Total!!20.0 to 24.9 percent"
"B25070_006E","Estimate!!Total!!25.0 to 29.9 percent"
"B25070_006M","Margin of Error!!Total!!25.0 to 29.9 percent"
"B25070_007E","Estimate!!Total!!30.0 to 34.9 percent"
"B25070_007M","Margin of Error!!Total!!30.0 to 34.9 percent"
"B25070_008E","Estimate!!Total!!35.0 to 39.9 percent"
"B25070_008M","Margin of Error!!Total!!35.0 to 39.9 percent"
"B25070_009E","Estimate!!Total!!40.0 to 49.9 percent"
"B25070_009M","Margin of Error!!Total!!40.0 to 49.9 percent"
"B25070_010E","Estimate!!Total!!50.0 percent or more"
"B25070_010M","Margin of Error!!Total!!50.0 percent or more"
"B25070_011E","Estimate!!Total!!Not computed"
"B25070_011M","Margin of Error!!Total!!Not computed"
```

We observe a "total" count (of renter households, I presume) and then the count falling in each bin of the share.

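We could obtain a weighted average share of household income spent on rent by assuming that every household in a bin has a share equal to the bin's upper bound, or its midpoint. The open-ended "50.0 percent or more" bin has no upper bound, so it would need its own assumption, and the "Not computed" bin would probably have to be dropped.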
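A minimal Python sketch of that weighted average, just to fix ideas. The representative shares are assumptions: closed bins use their upper bound, the open-ended top bin uses 50.0 as a placeholder, and "Not computed" (`B25070_011E`) is excluded. The function name `weighted_rent_share` is hypothetical and not part of the repo.

```python
# Assumed representative share (in percent) for each B25070 bin.
# Closed bins: upper bound. Top bin ("50.0 percent or more") has no
# upper bound, so 50.0 is a placeholder assumption, not a data fact.
# B25070_011E ("Not computed") is deliberately left out.
ASSUMED_SHARE = {
    "B25070_002E": 10.0,
    "B25070_003E": 14.9,
    "B25070_004E": 19.9,
    "B25070_005E": 24.9,
    "B25070_006E": 29.9,
    "B25070_007E": 34.9,
    "B25070_008E": 39.9,
    "B25070_009E": 49.9,
    "B25070_010E": 50.0,
}


def weighted_rent_share(row):
    """Count-weighted average of the assumed bin shares for one geography.

    `row` maps B25070 estimate columns to counts; missing columns count as 0.
    Returns None when no household falls in a computed bin.
    """
    total = sum(row.get(col, 0) for col in ASSUMED_SHARE)
    if total == 0:
        return None
    weighted = sum(row.get(col, 0) * share for col, share in ASSUMED_SHARE.items())
    return weighted / total


# Example with made-up counts: 10 households in the lowest bin and
# 10 in the "20.0 to 24.9 percent" bin.
example = {"B25070_002E": 10, "B25070_005E": 10}
print(weighted_rent_share(example))
```

Using the upper bound biases the average upward within closed bins, while the placeholder for the top bin biases it downward, so the choice of representative values is worth revisiting.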

2. Data are at the block group level

This issue is easier, but requires a bit of work. We would need to aggregate the data to the zcta level using our beloved census_block_master.
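A rough Python sketch of the aggregation, under the assumption that the crosswalk can be reduced to (block group, zcta, weight) triples, where the weight is the fraction of the block group assigned to that zcta. The actual layout of census_block_master may differ, and `aggregate_to_zcta` is a hypothetical name. Counts are summed first, so any share would be computed afterwards on the zcta totals.

```python
from collections import defaultdict


def aggregate_to_zcta(bg_counts, crosswalk):
    """Sum block-group bin counts into zcta-level counts.

    bg_counts: {block_group_id: {column_name: count}}
    crosswalk: iterable of (block_group_id, zcta, weight) triples, where
               weight is the assumed fraction of the block group in the zcta.
    Returns {zcta: {column_name: weighted_count}}.
    """
    zcta_counts = defaultdict(lambda: defaultdict(float))
    for block_group, zcta, weight in crosswalk:
        # Block groups missing from the data contribute nothing.
        for col, count in bg_counts.get(block_group, {}).items():
            zcta_counts[zcta][col] += weight * count
    return {zcta: dict(cols) for zcta, cols in zcta_counts.items()}


# Made-up example: block group "A" is split evenly across two zctas,
# block group "B" lies entirely in the first one.
bg_counts = {
    "A": {"B25070_002E": 10},
    "B": {"B25070_002E": 4},
}
crosswalk = [("A", "00001", 0.5), ("A", "00002", 0.5), ("B", "00001", 1.0)]
zcta_totals = aggregate_to_zcta(bg_counts, crosswalk)
print(zcta_totals)
```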

santiagohermo commented 3 weeks ago

Continues in #280