Open: idahopotato1 opened this issue 4 years ago
Sorry for the slow response! Answers:
For #1 and #3, you shouldn't need access to those files. 03_state_claims_weekly_analysis.R is something I used to make a graph comparing unemployment claims by state to 2016 election results. Sorry for the confusion; I'll probably delete that from the repo.
For #2, the only tables I created in Vertica hold raw data that I downloaded from IPUMS (as mentioned in the readme). You don't need all of these variables from IPUMS, only the ones that the scripts pull from the tables (01_load_acsocc_data.R and 03a_ipums_version.R, if I remember correctly). In case it's helpful, here are the two table structures.
The first is publicdata.cps_raw (which holds the CPS microdata):
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
------------+---------+-----------+---------------+------+---------+----------+-------------+-------------
publicdata | cps_raw | year | int | 8 | | f | f |
publicdata | cps_raw | serial | int | 8 | | f | f |
publicdata | cps_raw | month | int | 8 | | f | f |
publicdata | cps_raw | hwtfinl | float | 8 | | f | f |
publicdata | cps_raw | cpsid | varchar(4000) | 4000 | | f | f |
publicdata | cps_raw | asecflag | int | 8 | | f | f |
publicdata | cps_raw | region | int | 8 | | f | f |
publicdata | cps_raw | statefip | int | 8 | | f | f |
publicdata | cps_raw | metro | int | 8 | | f | f |
publicdata | cps_raw | metarea | int | 8 | | f | f |
publicdata | cps_raw | county | int | 8 | | f | f |
publicdata | cps_raw | cbsasz | int | 8 | | f | f |
publicdata | cps_raw | metfips | int | 8 | | f | f |
publicdata | cps_raw | individcc | int | 8 | | f | f |
publicdata | cps_raw | faminc | int | 8 | | f | f |
publicdata | cps_raw | pernum | int | 8 | | f | f |
publicdata | cps_raw | wtfinl | float | 8 | | f | f |
publicdata | cps_raw | cpsidp | varchar(4000) | 4000 | | f | f |
publicdata | cps_raw | age | int | 8 | | f | f |
publicdata | cps_raw | sex | int | 8 | | f | f |
publicdata | cps_raw | race | int | 8 | | f | f |
publicdata | cps_raw | marst | int | 8 | | f | f |
publicdata | cps_raw | popstat | int | 8 | | f | f |
publicdata | cps_raw | nchild | int | 8 | | f | f |
publicdata | cps_raw | eldch | int | 8 | | f | f |
publicdata | cps_raw | yngch | int | 8 | | f | f |
publicdata | cps_raw | citizen | int | 8 | | f | f |
publicdata | cps_raw | hispan | int | 8 | | f | f |
publicdata | cps_raw | empstat | int | 8 | | f | f |
publicdata | cps_raw | labforce | int | 8 | | f | f |
publicdata | cps_raw | occ | int | 8 | | f | f |
publicdata | cps_raw | occ2010 | int | 8 | | f | f |
publicdata | cps_raw | occ1990 | int | 8 | | f | f |
publicdata | cps_raw | ind1990 | int | 8 | | f | f |
publicdata | cps_raw | occ1950 | int | 8 | | f | f |
publicdata | cps_raw | ind | int | 8 | | f | f |
publicdata | cps_raw | ind1950 | int | 8 | | f | f |
publicdata | cps_raw | classwkr | int | 8 | | f | f |
publicdata | cps_raw | uhrsworkt | int | 8 | | f | f |
publicdata | cps_raw | uhrswork1 | int | 8 | | f | f |
publicdata | cps_raw | uhrswork2 | int | 8 | | f | f |
publicdata | cps_raw | ahrsworkt | int | 8 | | f | f |
publicdata | cps_raw | ahrswork1 | int | 8 | | f | f |
publicdata | cps_raw | ahrswork2 | int | 8 | | f | f |
publicdata | cps_raw | absent | int | 8 | | f | f |
publicdata | cps_raw | durunem2 | int | 8 | | f | f |
publicdata | cps_raw | durunemp | int | 8 | | f | f |
publicdata | cps_raw | whyunemp | int | 8 | | f | f |
publicdata | cps_raw | whyabsnt | int | 8 | | f | f |
publicdata | cps_raw | whyptlwk | int | 8 | | f | f |
publicdata | cps_raw | wnftlook | int | 8 | | f | f |
publicdata | cps_raw | wnlook | int | 8 | | f | f |
publicdata | cps_raw | wkstat | int | 8 | | f | f |
publicdata | cps_raw | empsame | int | 8 | | f | f |
publicdata | cps_raw | multjob | int | 8 | | f | f |
publicdata | cps_raw | numjob | int | 8 | | f | f |
publicdata | cps_raw | paidemp1 | int | 8 | | f | f |
publicdata | cps_raw | paidemp1n | int | 8 | | f | f |
publicdata | cps_raw | paidemp2 | int | 8 | | f | f |
publicdata | cps_raw | paidemp2n | int | 8 | | f | f |
publicdata | cps_raw | profcert | int | 8 | | f | f |
publicdata | cps_raw | statecert | int | 8 | | f | f |
publicdata | cps_raw | jobcert | int | 8 | | f | f |
publicdata | cps_raw | wrkoffer | int | 8 | | f | f |
publicdata | cps_raw | nilfact | int | 8 | | f | f |
publicdata | cps_raw | actsame | int | 8 | | f | f |
publicdata | cps_raw | educ | int | 8 | | f | f |
publicdata | cps_raw | unionhh | int | 8 | | f | f |
publicdata | cps_raw | otpay | int | 8 | | f | f |
publicdata | cps_raw | vowhynot | int | 8 | | f | f |
publicdata | cps_raw | voynotreg | int | 8 | | f | f |
publicdata | cps_raw | votehow | int | 8 | | f | f |
publicdata | cps_raw | votewhen | int | 8 | | f | f |
publicdata | cps_raw | voreghow | int | 8 | | f | f |
publicdata | cps_raw | voreg95 | int | 8 | | f | f |
publicdata | cps_raw | voteres | int | 8 | | f | f |
publicdata | cps_raw | voteresp | int | 8 | | f | f |
publicdata | cps_raw | voted | int | 8 | | f | f |
publicdata | cps_raw | voreg | int | 8 | | f | f |
publicdata | cps_raw | vosuppwt | float | 8 | | f | f |
And here is publicdata.longterm_pums_data (which holds the ACS microdata):
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
------------+--------------------+-------------+---------------+------+---------+----------+-------------+-------------
publicdata | longterm_pums_data | year | int | 8 | | f | f |
publicdata | longterm_pums_data | sample | int | 8 | | f | f |
publicdata | longterm_pums_data | serial | int | 8 | | f | f |
publicdata | longterm_pums_data | cbserial | varchar(4000) | 4000 | | f | f |
publicdata | longterm_pums_data | hhwt | float | 8 | | f | f |
publicdata | longterm_pums_data | cluster | varchar(4000) | 4000 | | f | f |
publicdata | longterm_pums_data | stateicp | int | 8 | | f | f |
publicdata | longterm_pums_data | statefip | int | 8 | | f | f |
publicdata | longterm_pums_data | countyicp | int | 8 | | f | f |
publicdata | longterm_pums_data | countyfip | int | 8 | | f | f |
publicdata | longterm_pums_data | strata | varchar(4000) | 4000 | | f | f |
publicdata | longterm_pums_data | gq | int | 8 | | f | f |
publicdata | longterm_pums_data | farm | int | 8 | | f | f |
publicdata | longterm_pums_data | pernum | int | 8 | | f | f |
publicdata | longterm_pums_data | perwt | float | 8 | | f | f |
publicdata | longterm_pums_data | sex | int | 8 | | f | f |
publicdata | longterm_pums_data | age | int | 8 | | f | f |
publicdata | longterm_pums_data | marst | int | 8 | | f | f |
publicdata | longterm_pums_data | birthyr | int | 8 | | f | f |
publicdata | longterm_pums_data | race | int | 8 | | f | f |
publicdata | longterm_pums_data | raced | int | 8 | | f | f |
publicdata | longterm_pums_data | hispan | int | 8 | | f | f |
publicdata | longterm_pums_data | hispand | int | 8 | | f | f |
publicdata | longterm_pums_data | citizen | int | 8 | | f | f |
publicdata | longterm_pums_data | speakeng | int | 8 | | f | f |
publicdata | longterm_pums_data | hcovany | int | 8 | | f | f |
publicdata | longterm_pums_data | hcovpriv | int | 8 | | f | f |
publicdata | longterm_pums_data | hinsemp | int | 8 | | f | f |
publicdata | longterm_pums_data | hcovpub | int | 8 | | f | f |
publicdata | longterm_pums_data | hinscaid | int | 8 | | f | f |
publicdata | longterm_pums_data | hinscare | int | 8 | | f | f |
publicdata | longterm_pums_data | educ | int | 8 | | f | f |
publicdata | longterm_pums_data | educd | int | 8 | | f | f |
publicdata | longterm_pums_data | empstat | int | 8 | | f | f |
publicdata | longterm_pums_data | empstatd | int | 8 | | f | f |
publicdata | longterm_pums_data | labforce | int | 8 | | f | f |
publicdata | longterm_pums_data | occ | int | 8 | | f | f |
publicdata | longterm_pums_data | occ2010 | int | 8 | | f | f |
publicdata | longterm_pums_data | ind | int | 8 | | f | f |
publicdata | longterm_pums_data | ind1990 | int | 8 | | f | f |
publicdata | longterm_pums_data | classwkr | int | 8 | | f | f |
publicdata | longterm_pums_data | classwkrd | int | 8 | | f | f |
publicdata | longterm_pums_data | inctot | int | 8 | | f | f |
publicdata | longterm_pums_data | ftotinc | int | 8 | | f | f |
publicdata | longterm_pums_data | versionhist | int | 8 | | f | f |
publicdata | longterm_pums_data | vetstat | int | 8 | | f | f |
publicdata | longterm_pums_data | vetstatd | int | 8 | | f | f |
publicdata | longterm_pums_data | tranwork | int | 8 | | f | f |
publicdata | longterm_pums_data | trantime | int | 8 | | f | f |
publicdata | longterm_pums_data | histid | varchar(4000) | 4000 | | f | f |
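If anyone wants to recreate these tables without typing every column by hand, here is a minimal Python sketch that turns a listing like the ones above into a CREATE TABLE statement. The function name `ddl_from_listing` and the `SAMPLE` snippet are made up for illustration and are not part of the repo; the types are copied through verbatim, so they may need adjusting if you load the data into something other than Vertica.

```python
def ddl_from_listing(listing: str) -> str:
    """Build a CREATE TABLE statement from a pipe-delimited column listing.

    Hypothetical helper: assumes each data row starts with
    schema | table | column | type, as in the listings above.
    """
    columns = []
    qualified_name = None
    for line in listing.strip().splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Skip the header row, the dashed separator, and blank lines.
        if len(parts) < 4 or parts[0] == "Schema":
            continue
        schema, table, column, col_type = parts[:4]
        qualified_name = f"{schema}.{table}"
        columns.append(f"    {column} {col_type}")
    if qualified_name is None:
        raise ValueError("no column rows found in listing")
    return f"CREATE TABLE {qualified_name} (\n" + ",\n".join(columns) + "\n);"


# A three-column excerpt of the cps_raw listing, used only as a demo input.
SAMPLE = """
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
------------+---------+-----------+---------------+------+---------+----------+-------------+-------------
publicdata | cps_raw | year | int | 8 | | f | f |
publicdata | cps_raw | hwtfinl | float | 8 | | f | f |
publicdata | cps_raw | cpsid | varchar(4000) | 4000 | | f | f |
"""

if __name__ == "__main__":
    print(ddl_from_listing(SAMPLE))
```

Paste in the full listing for either table to get the complete statement. Since only some of the variables are actually used, you could also trim the listing down to the columns the scripts pull before generating the DDL.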
Thank you so much for your advice. I will give it a shot.
Hello again. It seems that you didn't use any of the time series data in your model, just the microdata. If so, are you using the time series data for demonstration purposes only?
Do you have an accuracy score you can share? Thanks.
I ended up using a machine learning approach. I have not run it for all the census tracts yet, but for the ones I did run, I got pretty much the same results as yours. This was a very fun project. Thank you for the inspiration.
Thank you for posting the model. It is great. I do have three questions.