PSLmodels / tax-microdata-benchmarking

A project to develop a benchmarked general-purpose dataset for tax reform impact analysis.
https://pslmodels.github.io/tax-microdata-benchmarking/

Data examination results for pension contributions #82

Closed martinholmer closed 4 months ago

martinholmer commented 6 months ago

The amount of defined-contribution (DC) pension contributions (pencon_p for the tax unit head and pencon_s for the tax unit spouse when married filing jointly) seems too low, primarily because very few people in our most recent dataset have a positive value for these two variables.

Here is a tabulation of the tmd.csv file (for 2021) that is being used to generate the most recent examination results:

% awk -F, 'NR==1{for(i=1;i<=NF;i++)print i,$i}' tmd.csv | grep -e pencon -e s006
11 s006
50 pencon_p
51 pencon_s

% awk -F, 'NR==1{next}{t++}$50>0{n++}END{print t,n,n/t}' tmd.csv 
233412   6789   0.0290859                 <--- UNWEIGHTED HEADS (#)

% awk -F, 'NR==1{next}{w=$11;t+=w}$50>0{n+=w}END{print t*1e-6,n*1e-6,n/t}' tmd.csv
219.594   15.171   0.0690865              <--- WEIGHTED HEADS (#M)

% awk -F, 'NR==1{next}{w=$11;t+=w}$50>0{n+=w}$51>0{n+=w}END{print t*1e-6,n*1e-6,n/t}' tmd.csv
219.594   16.0148   0.0729291             <--- WEIGHTED PEOPLE (#M)

% awk -F, 'NR==1{next}{w=$11;t+=w;c+=w*($50+$51)}END{print t*1e-6,c*1e-9}' tmd.csv
219.594   48.7986                         <--- WEIGHTED DOLLAR CONTRIBUTIONS ($B)
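The awk one-liners above address columns by position (11, 50, 51), which breaks silently if the column order of tmd.csv changes. A minimal Python sketch of the same four tabulations, keyed by column name, might look like the following. The function name `tabulate_pencon` and the four-row `SAMPLE` data are made up for illustration; only the column names `s006`, `pencon_p`, and `pencon_s` come from the file header shown above, and the sketch assumes every value in those columns is numeric.

```python
import csv
import io


def tabulate_pencon(rows):
    """Tabulate pension contributions from an iterable of dict-like rows.

    Each row must have numeric s006 (weight), pencon_p, and pencon_s values.
    """
    units = 0          # unweighted tax units
    heads_pos = 0      # unweighted heads with pencon_p > 0
    w_units = 0.0      # weighted tax units
    w_heads = 0.0      # weighted heads with pencon_p > 0
    w_people = 0.0     # weighted heads plus spouses with a positive amount
    w_dollars = 0.0    # weighted dollars of pencon_p + pencon_s
    for row in rows:
        w = float(row["s006"])
        p = float(row["pencon_p"])
        s = float(row["pencon_s"])
        units += 1
        w_units += w
        if p > 0:
            heads_pos += 1
            w_heads += w
            w_people += w
        if s > 0:
            w_people += w
        w_dollars += w * (p + s)
    return {
        "units": units,
        "heads_pos": heads_pos,
        "w_units": w_units,
        "w_heads": w_heads,
        "w_people": w_people,
        "w_dollars": w_dollars,
    }


# Hypothetical four-row sample, NOT real tmd.csv data.
SAMPLE = """s006,pencon_p,pencon_s
100,0,0
200,5000,0
300,4000,2000
400,0,0
"""

result = tabulate_pencon(csv.DictReader(io.StringIO(SAMPLE)))
print(result)

# To run against the real file instead (assumes tmd.csv is in the
# current directory):
#     with open("tmd.csv", newline="") as f:
#         result = tabulate_pencon(csv.DictReader(f))
```

As in the awk version, the "weighted people" share divides by weighted tax units rather than by a count of persons, so it slightly overstates the per-person share wherever tax units contain two adults.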

The 16.0 million people with positive DC contributions compare with the USDOL Form 5500 count of 2020 "active participants" of nearly 85.3 million:

[Image: Screenshot 2024-05-11 at 4 36 50 PM, showing the USDOL Form 5500 results for 2020]

And the $48.8 billion tabulation compares with the USDOL Form 5500 results for 2020 of almost $586 billion shown above. So, the tmd.csv file captures less than ten percent of the DOL total for DC pension contributions. Even if the DOL contribution total includes both employee and employer DC contributions, the employee DC contribution amounts in the tmd.csv file seem too low.

martinholmer commented 6 months ago

Better targets for years up through 2018 are from IRS-SOI tabulations of W-2 forms.

The 2018 tabulations of taxpayers with an employee pension contribution are:

Number of taxpayers (#M)               60.353
Gross (Medicare) earnings ($B)       5062.371
Employee pension contributions ($B)   332.520

So, the 16.0 million taxpayers in 2021 tabulated using the tmd.csv file is clearly too low. Also, the $48.8 billion in 2021 employee pension contributions tabulated using the tmd.csv file is way below the actual 2018 value of almost $333 billion.
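To put numbers on the size of the gap, the following sketch computes the tmd.csv totals as a share of each target, plus the average contribution per contributor. All inputs are the figures quoted in this thread; the variable names are illustrative. The comparison is rough: the years differ (2021 for tmd.csv, 2020 for DOL, 2018 for SOI), and the DOL dollar total may include employer contributions.

```python
# Figures quoted in this thread (millions of people, billions of dollars).
tmd_people_m = 16.0148      # tmd.csv, 2021, weighted people with pencon > 0
tmd_dollars_b = 48.7986     # tmd.csv, 2021, weighted pencon_p + pencon_s
dol_people_m = 85.3         # USDOL Form 5500, 2020, "nearly 85.3 million"
dol_dollars_b = 586.0       # USDOL Form 5500, 2020, "almost $586 billion"
soi_people_m = 60.353       # IRS-SOI W-2 tabulation, 2018
soi_dollars_b = 332.520     # IRS-SOI W-2 tabulation, 2018

# tmd.csv totals as a share of each target.
share_people_dol = tmd_people_m / dol_people_m
share_dollars_dol = tmd_dollars_b / dol_dollars_b
share_people_soi = tmd_people_m / soi_people_m
share_dollars_soi = tmd_dollars_b / soi_dollars_b

# Average contribution per contributor, in dollars.
tmd_avg = tmd_dollars_b * 1e9 / (tmd_people_m * 1e6)
soi_avg = soi_dollars_b * 1e9 / (soi_people_m * 1e6)

print(f"share of DOL people:  {share_people_dol:.1%}")
print(f"share of DOL dollars: {share_dollars_dol:.1%}")
print(f"share of SOI people:  {share_people_soi:.1%}")
print(f"share of SOI dollars: {share_dollars_soi:.1%}")
print(f"average per contributor: tmd ${tmd_avg:,.0f} vs SOI ${soi_avg:,.0f}")
```

Against the SOI target, tmd.csv has about 27 percent of the contributors but only about 15 percent of the dollars, so the average contribution among those with a positive amount is also low, not just the number of contributors.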

It does appear that @donboyd5 was correct to highlight this issue in #8.

The 2011 IRS-SOI W-2 tabulations were used in the taxdata repository's impute_pencon.py module to impute pencon_p and pencon_s values to 2011 PUF data. The 2015 IRS-SOI tabulations could be used to impute more accurate pension contributions to the 2015 PUF.

donboyd5 commented 6 months ago

This is extremely helpful. Thanks.

martinholmer commented 4 months ago

After the merge of PR #90, the weighted sum of the pencon_p and pencon_s variables is much larger. So, the original issue (that pension contributions were way too low) has now been resolved.