PSLmodels / OG-USA

Overlapping-generations macroeconomic model for evaluating fiscal policy in the United States
https://pslmodels.github.io/OG-USA/
Creative Commons Zero v1.0 Universal

Household structure variables from the PSID #6

Open jdebacker opened 4 years ago

jdebacker commented 4 years ago

@MaxGhenis and @prrathi are working on an estimate of the number of children per household. The current version of the psid_download.R script in this repo pulls a few related variables:

# Demographics
head_age="ER17013",
spouse_age="ER47319",
head_gender="ER47318",
head_num_children="V10977",
num_children="ER37724",
num_children_away_from_home="V561",
num_children_under18="ER47320",

Are there additional variables you two would like pulled? FYI, PSID variable search is here.

MaxGhenis commented 4 years ago

Thanks @jdebacker, what's currently in there should be a good start. Do you know if PSID family units also include elderly dependents? I couldn't find fields for this, and it seems like the biggest potential gap.

The cleanest approach would probably be to use individual ages and then merge back to the family-unit level, but it might not be worth it.
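
A minimal sketch of that individual-level approach, assuming a hypothetical person-level table with a family identifier and an age column. The names family_id and age and the records themselves are illustrative, not taken from the PSID extract. Each person is flagged by age bucket, and the flags are summed within each family unit:

```python
import pandas as pd

# Hypothetical person-level records; column names are illustrative only.
persons = pd.DataFrame({
    "family_id": [1, 1, 1, 2, 2, 3],
    "age":       [40, 38, 10, 70, 45, 17],
})


def count_age_groups(persons: pd.DataFrame) -> pd.DataFrame:
    """Count people under 18, aged 18-64, and 65+ in each family unit."""
    flags = pd.DataFrame({
        "family_id": persons["family_id"],
        "nu18": (persons["age"] < 18).astype(int),
        "n1864": persons["age"].between(18, 64).astype(int),
        "n65": (persons["age"] >= 65).astype(int),
    })
    # Summing the 0/1 flags within a family gives the head count per bucket.
    return flags.groupby("family_id").sum()


family_counts = count_age_groups(persons)
print(family_counts)
```

Because every member of the family unit is counted, an elderly dependent would show up in n65 here, which a formula based only on head and spouse ages cannot capture.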

MaxGhenis commented 4 years ago

To be specific, here's how I think we can calculate each metric used for the UBI:

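# Cast boolean masks to int so they add as counts rather than combining as logical OR.
nu18 = (head_age < 18).astype(int) + (spouse_age < 18).astype(int) + num_children_under18
# Temp: assume all adult children are under 65.
num_adult_children = num_children - num_children_under18
n1864 = head_age.between(18, 64).astype(int) + spouse_age.between(18, 64).astype(int) + num_adult_children
n65 = (head_age > 64).astype(int) + (spouse_age > 64).astype(int)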
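
To sanity-check the formulas above, here is a small sketch on made-up records. The data are invented, and it assumes every family has a spouse with a valid age. How the PSID codes a missing spouse is not confirmed in this thread, so such records would need separate handling. The masks are cast to integers before adding, since adding two boolean Series in pandas behaves like a logical OR and would undercount when head and spouse fall in the same bucket.

```python
import pandas as pd

# Made-up family-level records using the column names from psid_download.R.
fam = pd.DataFrame({
    "head_age": [17, 40, 70],
    "spouse_age": [17, 38, 66],
    "num_children": [1, 3, 0],
    "num_children_under18": [1, 2, 0],
})


def as_count(mask: pd.Series) -> pd.Series:
    """Turn a boolean mask into 0/1 integers so masks add up as counts."""
    return mask.astype(int)


nu18 = as_count(fam.head_age < 18) + as_count(fam.spouse_age < 18) + fam.num_children_under18
# Temporary assumption from the thread: all adult children are under 65.
num_adult_children = fam.num_children - fam.num_children_under18
n1864 = (as_count(fam.head_age.between(18, 64))
         + as_count(fam.spouse_age.between(18, 64))
         + num_adult_children)
n65 = as_count(fam.head_age > 64) + as_count(fam.spouse_age > 64)

# Every person should land in exactly one bucket (head + spouse + children).
household_size = 2 + fam.num_children
assert ((nu18 + n1864 + n65) == household_size).all()
```

The closing check is a cheap invariant: if the three counts do not sum to the household size, one of the age conditions overlaps or leaves a gap.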

Another consideration is that taxcalc currently also has an n1820 variable for UBI policies that give a different amount to this age group. Parity would require using individual ages. I don't think this is too important, though; it's probably just for one semi-prominent UBI proposal (Charles Murray's).

MaxGhenis commented 3 years ago

num_children is null for 98% of records, so we might have to just use num_children_under18 and skip adult dependents, at least to start.
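
A sketch of that fallback, under stated assumptions: measure the share of missing values in num_children and drop the adult-children term when the column is too sparse to use. The 0.5 threshold and the toy records are illustrative choices, not values from this thread.

```python
import numpy as np
import pandas as pd

# Toy records; num_children is mostly missing, mimicking the pattern reported above.
fam = pd.DataFrame({
    "head_age": [30, 45, 70, 52],
    "spouse_age": [28, 44, 68, 50],
    "num_children_under18": [2, 1, 0, 0],
    "num_children": [np.nan, np.nan, np.nan, 2.0],
})

MAX_NULL_SHARE = 0.5  # illustrative threshold, not from the thread

null_share = fam["num_children"].isna().mean()
if null_share > MAX_NULL_SHARE:
    # Too sparse: skip adult dependents and count only head and spouse.
    num_adult_children = pd.Series(0, index=fam.index)
else:
    # Usable: adult children are total children minus those under 18.
    # Any remaining missing values propagate as NaN.
    num_adult_children = fam["num_children"] - fam["num_children_under18"]

n1864 = (fam["head_age"].between(18, 64).astype(int)
         + fam["spouse_age"].between(18, 64).astype(int)
         + num_adult_children)
print(null_share, n1864.tolist())
```

With this fallback n1864 is a lower bound, since adult children living in the family unit are left out.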

!wget https://github.com/jdebacker/OG-USA-Calibration/raw/master/EarningsProcesses/psid_data_files/psid1968to2015.RData
import pyreadr
# read_r returns a dict-like object keyed by R object name; psid_df is the data frame.
psid = pyreadr.read_r('psid1968to2015.RData')['psid_df']
# Share of missing values in each column.
psid[['head_age', 'spouse_age', 'num_children_under18', 'num_children']].isna().mean()

head_age                0.000000
spouse_age              0.000000
num_children_under18    0.000000
num_children            0.982549
dtype: float64

MaxGhenis commented 3 years ago

I think we can close this and discuss in #9.