Catalist-LLC / unemployment

This is the source code used to produce DEEP-MAPS estimates of the labor force, as described in the included working paper.

trying to use the model #1

Open idahopotato1 opened 4 years ago

idahopotato1 commented 4 years ago

Thank you for posting the model. It is great. I do have three questions.

  1. In 03_state_claims_weekly_analysis.R, there is a call to a shell script, "04_upload_to_server.sh". I don't see the .sh file anywhere. Is it used to upload the data to Vertica?
  2. I couldn't find the table structures you created in Vertica. I tried switching to SQLite, but without the table structures it is difficult to make the code work. Would you share what the table or tables look like?
  3. A twoway2016.rds file is mentioned in the code, but I don't see it anywhere, and I don't see it being created. Would you mind sharing that as well?

Thank you again for the great work. This is very helpful to a project of mine.

yghitza commented 4 years ago

Sorry for the slow response! Answers:

For #1 and #3, you shouldn't need access to those files. 03_state_claims_weekly_analysis.R is something I used to make a graph comparing unemployment claims by state to 2016 election results. Sorry for the confusion, I'll probably delete that from the repo.

For #2, the only tables I created in Vertica hold raw data that I downloaded from IPUMS (as mentioned in the readme). You don't need all of these variables from IPUMS, only the ones that the code pulls from the tables (in 01_load_acsocc_data.R and 03a_ipums_version.R, if I remember correctly). In case it's helpful, here are the two table structures.

The first is publicdata.cps_raw (which holds the CPS microdata):

   Schema   |  Table  |  Column   |     Type      | Size | Default | Not Null | Primary Key | Foreign Key 
------------+---------+-----------+---------------+------+---------+----------+-------------+-------------
 publicdata | cps_raw | year      | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | serial    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | month     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | hwtfinl   | float         |    8 |         | f        | f           | 
 publicdata | cps_raw | cpsid     | varchar(4000) | 4000 |         | f        | f           | 
 publicdata | cps_raw | asecflag  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | region    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | statefip  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | metro     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | metarea   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | county    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | cbsasz    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | metfips   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | individcc | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | faminc    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | pernum    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | wtfinl    | float         |    8 |         | f        | f           | 
 publicdata | cps_raw | cpsidp    | varchar(4000) | 4000 |         | f        | f           | 
 publicdata | cps_raw | age       | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | sex       | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | race      | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | marst     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | popstat   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | nchild    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | eldch     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | yngch     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | citizen   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | hispan    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | empstat   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | labforce  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | occ       | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | occ2010   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | occ1990   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | ind1990   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | occ1950   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | ind       | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | ind1950   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | classwkr  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | uhrsworkt | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | uhrswork1 | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | uhrswork2 | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | ahrsworkt | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | ahrswork1 | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | ahrswork2 | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | absent    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | durunem2  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | durunemp  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | whyunemp  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | whyabsnt  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | whyptlwk  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | wnftlook  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | wnlook    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | wkstat    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | empsame   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | multjob   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | numjob    | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | paidemp1  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | paidemp1n | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | paidemp2  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | paidemp2n | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | profcert  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | statecert | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | jobcert   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | wrkoffer  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | nilfact   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | actsame   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | educ      | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | unionhh   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | otpay     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | vowhynot  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | voynotreg | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | votehow   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | votewhen  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | voreghow  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | voreg95   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | voteres   | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | voteresp  | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | voted     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | voreg     | int           |    8 |         | f        | f           | 
 publicdata | cps_raw | vosuppwt  | float         |    8 |         | f        | f           | 
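
For anyone porting this to SQLite, as attempted in the question, a listing like the one above can be turned into a `CREATE TABLE` statement mechanically. This is only a minimal sketch: `listing_to_ddl` and `TYPE_MAP` are illustrative names, the mapping from Vertica types to SQLite types is an assumption, and only a few rows of the listing are copied in (paste the full listing to get every column). Attaching a database under the name `publicdata` should let queries keep the `publicdata.cps_raw` prefix.

```python
import re
import sqlite3

# A few rows copied from the publicdata.cps_raw listing; paste the full
# listing here (header and separator rows included) to get every column.
LISTING = """\
   Schema   |  Table  |  Column   |     Type      | Size | Default | Not Null | Primary Key | Foreign Key
------------+---------+-----------+---------------+------+---------+----------+-------------+-------------
 publicdata | cps_raw | year      | int           |    8 |         | f        | f           |
 publicdata | cps_raw | hwtfinl   | float         |    8 |         | f        | f           |
 publicdata | cps_raw | cpsid     | varchar(4000) | 4000 |         | f        | f           |
 publicdata | cps_raw | statefip  | int           |    8 |         | f        | f           |
 publicdata | cps_raw | empstat   | int           |    8 |         | f        | f           |
"""

# Assumed mapping from the types in the listing to SQLite column types.
TYPE_MAP = {"int": "INTEGER", "float": "REAL", "varchar": "TEXT"}


def listing_to_ddl(listing):
    """Build a CREATE TABLE statement from pipe-delimited listing rows."""
    schema = table = None
    columns = []
    for line in listing.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Skip the header row, the dashed separator, and blank lines.
        if len(fields) < 4 or fields[0] in ("", "Schema"):
            continue
        schema, table, column, vertica_type = fields[:4]
        # Drop any length suffix, e.g. varchar(4000) -> varchar.
        base_type = re.sub(r"\(.*\)", "", vertica_type)
        columns.append(f'"{column}" {TYPE_MAP[base_type]}')
    if not columns:
        raise ValueError("no column rows found in listing")
    return f'CREATE TABLE "{schema}"."{table}" (\n  ' + ",\n  ".join(columns) + "\n)"


conn = sqlite3.connect(":memory:")
# An attached database named publicdata stands in for the Vertica schema.
conn.execute("ATTACH DATABASE ':memory:' AS publicdata")
ddl = listing_to_ddl(LISTING)
conn.execute(ddl)
print(ddl)
```

For a database on disk, replace both `':memory:'` values with file paths.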

And here is publicdata.longterm_pums_data (which holds the ACS microdata):

   Schema   |       Table        |   Column    |     Type      | Size | Default | Not Null | Primary Key | Foreign Key 
------------+--------------------+-------------+---------------+------+---------+----------+-------------+-------------
 publicdata | longterm_pums_data | year        | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | sample      | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | serial      | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | cbserial    | varchar(4000) | 4000 |         | f        | f           | 
 publicdata | longterm_pums_data | hhwt        | float         |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | cluster     | varchar(4000) | 4000 |         | f        | f           | 
 publicdata | longterm_pums_data | stateicp    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | statefip    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | countyicp   | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | countyfip   | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | strata      | varchar(4000) | 4000 |         | f        | f           | 
 publicdata | longterm_pums_data | gq          | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | farm        | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | pernum      | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | perwt       | float         |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | sex         | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | age         | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | marst       | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | birthyr     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | race        | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | raced       | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hispan      | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hispand     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | citizen     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | speakeng    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hcovany     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hcovpriv    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hinsemp     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hcovpub     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hinscaid    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | hinscare    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | educ        | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | educd       | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | empstat     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | empstatd    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | labforce    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | occ         | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | occ2010     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | ind         | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | ind1990     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | classwkr    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | classwkrd   | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | inctot      | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | ftotinc     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | versionhist | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | vetstat     | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | vetstatd    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | tranwork    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | trantime    | int           |    8 |         | f        | f           | 
 publicdata | longterm_pums_data | histid      | varchar(4000) | 4000 |         | f        | f           | 
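
Since only some of these variables are needed, a SQLite port could load just a subset of columns from the IPUMS extract. The sketch below assumes the extract is a CSV whose header names match the table's column names up to letter case. `load_extract`, the `KEEP` list, and the sample rows are all made up for illustration; the columns the R scripts actually pull are listed in those scripts, not here.

```python
import csv
import io
import sqlite3

# Hypothetical subset of columns to keep; check the R scripts for the real list.
KEEP = ["year", "statefip", "perwt", "age", "empstat"]

# Made-up rows standing in for an IPUMS CSV extract (not real data).
SAMPLE_CSV = """\
YEAR,SAMPLE,STATEFIP,PERWT,AGE,EMPSTAT
2018,201801,16,104.0,34,1
2018,201801,16,87.0,61,3
"""


def load_extract(conn, table, csv_file, keep):
    """Insert the `keep` columns of a CSV file into an existing table."""
    reader = csv.DictReader(csv_file)
    # Map lower-cased header names back to the names used in the file.
    header = {name.lower(): name for name in (reader.fieldnames or [])}
    missing = [col for col in keep if col not in header]
    if missing:
        raise ValueError(f"columns missing from extract: {missing}")
    col_list = ", ".join(f'"{col}"' for col in keep)
    placeholders = ", ".join("?" for _ in keep)
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"
    rows = ([row[header[col]] for col in keep] for row in reader)
    with conn:  # commit on success, roll back on error
        cursor = conn.executemany(sql, rows)
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE longterm_pums_data "
    "(year INTEGER, statefip INTEGER, perwt REAL, age INTEGER, empstat INTEGER)"
)
inserted = load_extract(conn, "longterm_pums_data", io.StringIO(SAMPLE_CSV), KEEP)
print(inserted, "rows loaded")
```

The CSV values arrive as text; SQLite's column type affinity converts them to integers and reals on insert, so no explicit casting is done here.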

idahopotato1 commented 4 years ago

Thank you so much for your advice. I will give it a shot.

idahopotato1 commented 4 years ago

Hello again. It seems that you didn't use any of the time series data in your model, just the microdata. If so, are you using the time series data for demonstration purposes only?

idahopotato1 commented 4 years ago

Do you have any accuracy scores you can share? Thanks.

idahopotato1 commented 4 years ago

I ended up using a machine learning approach. I have not run it for all the census tracts yet, but for the ones I ran, I got pretty much the same results as yours. This was a very fun project. Thank you for the inspiration.