Open jdebacker opened 4 years ago
Thanks @jdebacker, what's currently in there should be a good start. Do you know if PSID family units also include elderly dependents? I couldn't find fields for this, and it seems like the biggest potential gap.
The cleanest approach would probably be to use individual ages and then merge back to the family unit level, but that might not be worth it.
To be specific, here's how I think we can calculate each metric used for the UBI:
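# Cast comparisons to int before adding: bool + bool on pandas Series is a logical OR, not a count.
nu18 = (head_age < 18).astype(int) + (spouse_age < 18).astype(int) + num_children_under18
# Temp: assume all adult children are under 65.
num_adult_children = num_children - num_children_under18
n1864 = head_age.between(18, 64).astype(int) + spouse_age.between(18, 64).astype(int) + num_adult_children
n65 = (head_age > 64).astype(int) + (spouse_age > 64).astype(int)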
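As a rough, runnable version of the calculation above, here is a minimal sketch on made-up data. The column names match the PSID extract discussed in this thread, but the values are invented, and the helper `age_counts` is hypothetical. It assumes every record has both a head and a spouse with a real age; how PSID codes a missing spouse is not handled here.

```python
import pandas as pd

# Toy family-level records (values invented for illustration).
fam = pd.DataFrame({
    "head_age": [17, 40, 70, 66],
    "spouse_age": [16, 38, 68, 60],
    "num_children_under18": [0, 2, 0, 1],
    "num_children": [0, 3, 0, 1],
})


def age_counts(df):
    """Hypothetical helper: count family members in each UBI age band."""
    out = pd.DataFrame(index=df.index)
    # Cast each comparison to int before summing; adding two boolean
    # Series would give a logical OR rather than a head count.
    out["nu18"] = (
        (df["head_age"] < 18).astype(int)
        + (df["spouse_age"] < 18).astype(int)
        + df["num_children_under18"]
    )
    # Temporary assumption from the thread: all adult children are under 65.
    num_adult_children = df["num_children"] - df["num_children_under18"]
    out["n1864"] = (
        df["head_age"].between(18, 64).astype(int)
        + df["spouse_age"].between(18, 64).astype(int)
        + num_adult_children
    )
    out["n65"] = (
        (df["head_age"] > 64).astype(int)
        + (df["spouse_age"] > 64).astype(int)
    )
    return out


counts = age_counts(fam)
print(counts)
```

The first row is the case that motivates the int cast: with both head and spouse under 18, the count should be 2.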
Another consideration is that taxcalc currently also has an n1820 for UBI policies that give a different amount to this age group. Parity would require using individual ages. I don't think this is too important though; it's probably just for one semi-prominent UBI proposal (Charles Murray's).
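If individual ages were available, the age bands (including n1820) could be counted per person and merged back to the family level. The sketch below shows one way that might look. The person-level table, the `family_id` and `age` column names, and the `n2164` band are all assumptions for illustration, not confirmed PSID fields, and ages are assumed to be whole years.

```python
import pandas as pd

# Hypothetical person-level table: one row per individual.
people = pd.DataFrame({
    "family_id": [1, 1, 1, 2, 2, 3],
    "age": [45, 19, 12, 67, 20, 18],
})

# One 0/1 flag per age band (integer ages assumed, so between() bounds are inclusive).
flags = pd.DataFrame({
    "family_id": people["family_id"],
    "nu18": (people["age"] < 18).astype(int),
    "n1820": people["age"].between(18, 20).astype(int),
    "n2164": people["age"].between(21, 64).astype(int),
    "n65": (people["age"] >= 65).astype(int),
})

# Sum the flags within each family to get head counts per band.
family_counts = flags.groupby("family_id").sum()
family_counts["n1864"] = family_counts["n1820"] + family_counts["n2164"]

# Merge the counts back onto a (hypothetical) family-level table.
families = pd.DataFrame({"family_id": [1, 2, 3]})
families = families.merge(
    family_counts, left_on="family_id", right_index=True, how="left"
)
print(families)
```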
num_children is null for 98% of records, so we might have to just use num_children_under18 and skip adult dependents, at least to start.
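One way to express that fallback is a minimal sketch on invented values: where num_children is missing, treat the number of adult children as zero and keep a flag recording that it was not observed. The output column names `num_adult_children` and `adult_children_known` are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy records: num_children is missing for most rows, as in the PSID extract.
fam = pd.DataFrame({
    "num_children_under18": [2, 1, 0],
    "num_children": [3, np.nan, np.nan],
})

# Difference is NaN wherever num_children is missing; fall back to 0 there.
adult = (fam["num_children"] - fam["num_children_under18"]).fillna(0)

# Guard against inconsistent records where the difference would be negative.
fam["num_adult_children"] = adult.clip(lower=0).astype(int)

# Keep track of which rows actually had num_children observed.
fam["adult_children_known"] = fam["num_children"].notna()
print(fam)
```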
import pyreadr

!wget https://github.com/jdebacker/OG-USA-Calibration/raw/master/EarningsProcesses/psid_data_files/psid1968to2015.RData
psid = pyreadr.read_r('psid1968to2015.RData')['psid_df']
# Share of missing values in each column.
psid[['head_age', 'spouse_age', 'num_children_under18', 'num_children']].isna().mean()
head_age 0.000000
spouse_age 0.000000
num_children_under18 0.000000
num_children 0.982549
dtype: float64
I think we can close this and discuss in #9.
@MaxGhenis and @prrathi are working on an estimate of the number of children per household. The current version of the psid_download.R script in this repo pulls a few related variables:
Are there additional variables you two would like pulled? FYI, PSID variable search is here.