PSLmodels / taxdata

The TaxData project prepares microdata for use with the Tax-Calculator microsimulation project.
http://pslmodels.github.io/taxdata/

Apply stochastic imputation to split income between spouses in PUF #432

Open MaxGhenis opened 1 year ago

MaxGhenis commented 1 year ago

The PUF currently splits income between spouses by taking the average split from the CPS by income source.

Compressing the heterogeneity in income splits across couples leads to significant underestimation of the impact of reforms that individualize tax programs. For example, we found that it understated the cost of Scott Winship's proposal to individualize the EITC by about two-thirds: https://policyengine.org/us/blog/winship-individualized-eitc

I'd suggest adding some stochasticity to this imputation. Other imputations are stochastic in some way, e.g. by drawing a value based on the mean and standard deviation of a distribution. PolicyEngine uses quantile regression forests instead, which I've found to be more accurate. But I'd expect stochasticity to matter more for this issue than, for example, slicing the data more granularly, which would still compress the distribution.
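
To make the contrast concrete, here is a minimal sketch of mean imputation versus a simple stochastic draw. It is not taxdata's or PolicyEngine's code: the donor shares and incomes are synthetic stand-ins for CPS and PUF records, every name (`donor_share`, `puf_income`, `split`) is hypothetical, and the unconditional random draw (a hot-deck style imputation) is swapped in for illustration rather than a quantile regression forest.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical donor data standing in for CPS couples: the primary filer's
# share of one income source. Synthetic values, not real CPS figures.
donor_share = [rng.betavariate(2.0, 1.0) for _ in range(5000)]

# Hypothetical recipient data standing in for joint returns in the PUF.
puf_income = [rng.lognormvariate(11.0, 0.5) for _ in range(5000)]


def split(income, share):
    """Return (filer, spouse) amounts that sum back to income."""
    filer = income * share
    return filer, income - filer


# Mean imputation: every couple receives the same average share.
mean_share = statistics.fmean(donor_share)
mean_split = [split(y, mean_share) for y in puf_income]

# Stochastic imputation: each couple receives a share drawn at random,
# with replacement, from the donor distribution.
drawn_share = [rng.choice(donor_share) for _ in puf_income]
draw_split = [split(y, s) for y, s in zip(puf_income, drawn_share)]

# Dispersion of the imputed shares under each approach.
print("donor share std:      ", statistics.pstdev(donor_share))
print("mean-imputed share std:", statistics.pstdev([mean_share] * len(puf_income)))
print("drawn share std:      ", statistics.pstdev(drawn_share))
```

Under mean imputation the imputed shares have essentially no spread, while the random draw roughly reproduces the donor spread. A production version would presumably condition the draw on covariates such as total income, which this sketch does not attempt.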

jdebacker commented 1 year ago

Interesting analysis and finding re the income splits @MaxGhenis!

In the article, "the Tax-Calculator project determines the average split of income between filer and spouse from the CPS, and applies that equally to PUF records." is not quite accurate -- the taxdata project does this split. Tax-Calculator just computes things like tax liability given some data.