diegogentilepassaro / min_wage_rent

GNU General Public License v3.0

Collect ZIP code business patterns data #251

Closed santiagohermo closed 1 year ago

santiagohermo commented 1 year ago

It is highly likely that, as part of the review, we will use data from ZIP code business patterns. In this issue we will create a script to download the data.

Steps:

  1. Understand the procedure to download the data (it looks like for 2019 onwards we need to use the County Business Patterns API)
  2. Create a new step in base/zipcode_biz_patterns that hosts a Python script to get the data
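
A minimal sketch of what such a script could look like, assuming the Census Bureau data API at api.census.gov serves the 2019+ ZIP-level data under the `cbp` dataset. The variable names (`ESTAB`, `EMP`, `PAYANN`), the geography label `zip code`, and all helper names are assumptions to check against the API documentation, not something settled in this issue.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.census.gov/data"  # assumed API root


def build_cbp_url(year, variables=("ESTAB", "EMP", "PAYANN"), geography="zip code"):
    """Build a query URL for one year (variable and geography names are assumptions)."""
    query = urlencode(
        {"get": ",".join(variables), "for": f"{geography}:*"},
        safe=",:*",       # keep separators readable in the URL
        quote_via=quote,  # encode the space as %20 instead of '+'
    )
    return f"{BASE_URL}/{year}/cbp?{query}"


def rows_to_records(payload):
    """Convert the API's list-of-lists JSON (first row is the header) to dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def fetch_records(year):
    """Download one year of data; requires network access."""
    with urlopen(build_cbp_url(year)) as response:
        return rows_to_records(json.load(response))
```

Calling `fetch_records(2019)` would then return one dictionary per ZIP code, with all values as strings so that leading zeros in ZIP codes are preserved.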
santiagohermo commented 1 year ago

If anyone has time we can quickly do this. FYI @diegogentilepassaro @gabrieleborg

diegogentilepassaro commented 1 year ago

Taking care of this one @gabrieleborg @santiagohermo!

diegogentilepassaro commented 1 year ago

Just added drive/raw_data/zip_biz_patterns to the GDrive, and all the ZIP-level files from 2009 to 2020 live there now. There are two types of files:

  1. ZIP totals files: contain, for each ZIP code, employee counts, total payroll, and establishment counts measured in mid-March of each year, plus noise flags for the different metrics.
  2. ZIP detail files: contain establishment counts (total and by size bin) for each ZIP-industry pair (industries as given by 6-digit NAICS codes).
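
As a rough illustration of how a ZIP totals file could be read in a base step, here is a sketch that assumes CSV files with columns named `zip`, `emp`, `ap`, and `est` (employees, annual payroll, establishments). Those column names and the helper name are assumptions and should be checked against the actual file headers. The main point is to keep ZIP codes as zero-padded strings rather than integers.

```python
import csv
import io


def read_zip_totals(handle, zip_col="zip", count_cols=("emp", "ap", "est")):
    """Read a ZIP totals CSV from an open text handle.

    Column names are assumed, not taken from the files themselves.
    ZIP codes are kept as 5-character strings; count columns become ints;
    every other column (e.g. noise flags) is left as a string.
    """
    records = []
    for row in csv.DictReader(handle):
        record = dict(row)
        record[zip_col] = row[zip_col].strip().zfill(5)
        for col in count_cols:
            record[col] = int(row[col])
        records.append(record)
    return records


# Example with an in-memory file; a real call would pass open(path, newline="").
sample = io.StringIO("zip,emp_nf,emp,ap,est\n2139,G,100,5000,10\n")
totals = read_zip_totals(sample)
```

Reading ZIP codes as strings avoids silently dropping leading zeros (for example, Massachusetts ZIP codes), which would otherwise break merges with other ZIP-level data.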
santiagohermo commented 1 year ago

Thanks @diegogentilepassaro! Two questions:

  1. You didn't need a Python script to get the data? Cool. Can I ask, how did you download it?
  2. I don't see the data in the GDrive folder. Maybe it didn't sync?

Do you think that we should definitely add any of this to our data build for the revision? If so, we can go ahead and create a base step to pick the variables we want. What do you think?

diegogentilepassaro commented 1 year ago

In this issue, we added the ZIP code business patterns data and created clean base files. Continues in #258!