PSLmodels / taxdata

The TaxData project prepares microdata for use with the Tax-Calculator microsimulation project.
http://pslmodels.github.io/taxdata/

FYI: Surprising relationship between total Social Security and taxable Social Security #388

Open donboyd5 opened 3 years ago

donboyd5 commented 3 years ago

I don't think the following issue is going to lead me to make adjustments in the short term, at least not until I understand it better, but I wanted to leave a record of it in case taxdata and tax-calculator developers are interested. I would be very interested in understanding what might cause the results I describe below.

Apparently we have far too much total Social Security income (e02500) in 2017 (+30% vs. IRS) because we have far too many weighted records with this income, although average Social Security income on those returns is about right. Thus, it is not really a growfactor issue, but maybe a weighting issue. Here is a table comparing the values.

image
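The decomposition behind this comparison (total = weighted number of records with the income × average amount on those records) can be sketched as follows. This is a minimal illustration on made-up data, not the code that produced the table above. It assumes `s006` is the record weight column, and the helper name `weighted_summary` is hypothetical.

```python
import pandas as pd

def weighted_summary(df, var, weight="s006"):
    """Weighted count, weighted total, and average of `var` over records where var > 0."""
    has_income = df[var] > 0
    n = df.loc[has_income, weight].sum()
    total = (df.loc[has_income, var] * df.loc[has_income, weight]).sum()
    avg = total / n if n > 0 else float("nan")
    return {"n_returns": n, "total": total, "average": avg}

# Made-up records for illustration only (not PUF data).
records = pd.DataFrame({
    "s006": [100.0, 200.0, 150.0, 50.0],          # assumed weight column
    "e02500": [0.0, 20000.0, 10000.0, 30000.0],   # total Social Security income
})
summary = weighted_summary(records, "e02500")
print(summary)
```

If the weighted count is about 30% too high while the average is close to the IRS value, the total will also be about 30% too high, which is the pattern described above.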

However, oddly, or at least surprisingly to me, this does not seem to have created bad results for taxable Social Security income (c02500). Here is a table comparing these values to the IRS:

image

As you can see, they come pretty close in both the number of weighted records that have this income and the average values. On the one hand, that is comforting, but on the other hand it is a little concerning and it would be nice to understand what is going on under the hood that takes base data that seems incorrect and produces a final result that seems correct.

In addition, the distribution of values across income ranges is not what we would want. The first table below shows the distribution of the number of returns with taxable Social Security in IRS published tables and in the puf as I have advanced it, and the second table shows the distribution of the total values. There are some good-sized differences within income ranges. I'll do some reweighting to try to bring the puf closer to the IRS distribution.

Anyway, I wanted to raise this issue and would much appreciate comments.

image

image
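A comparison by income range like the one in these tables could be built roughly as below. This is only a sketch on made-up records: it assumes `c00100` holds AGI and `s006` the weight, and the bin edges are hypothetical rather than the ranges used in the IRS published tables.

```python
import pandas as pd

# Made-up records; `c00100` (AGI) and `s006` (weight) are assumed column names.
df = pd.DataFrame({
    "s006":   [100.0, 200.0, 150.0, 50.0],
    "c00100": [20_000.0, 40_000.0, 80_000.0, 250_000.0],
    "c02500": [0.0, 3_000.0, 12_000.0, 25_000.0],  # taxable Social Security
})

# Hypothetical AGI bin edges, not the actual SOI ranges.
edges = [-float("inf"), 25_000, 50_000, 100_000, float("inf")]
labels = ["<25k", "25k-50k", "50k-100k", "100k+"]
df["agi_bin"] = pd.cut(df["c00100"], bins=edges, labels=labels, right=False)

# Weighted number of returns with taxable Social Security, and weighted total, by bin.
has_taxable = df["c02500"] > 0
by_bin = (
    df[has_taxable]
    .assign(wtd_amount=lambda d: d["c02500"] * d["s006"])
    .groupby("agi_bin", observed=False)
    .agg(n_returns=("s006", "sum"), total=("wtd_amount", "sum"))
)
print(by_bin)
```

Dividing the IRS value for each bin by the corresponding value here would give per-bin ratios that a reweighting step could target.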

MattHJensen commented 3 years ago

@donboyd5, thanks a lot for investigating this. I am just going through your recent issues. I'll offer quick reactions first, and then I hope to follow up again soon with any further ideas or investigation.

For anyone following this who isn't familiar with the context of Don's work, I'll note that these questions pertain to the use of the Tax-Calculator with the puf.csv file, which is produced/maintained over in the TaxData project.

> Apparently we have far too much total Social Security income (e02500) in 2017 (+30% vs. IRS) because we have far too many weighted records with this income...

> However, oddly, or at least surprisingly to me, this does not seem to have created bad results for taxable Social Security income (c02500).

Quick reaction:

Taxable Social Security income is the part of Social Security income that is taxed once a filer's income exceeds the taxability thresholds. It is created here and the first threshold is shown here.

(The thresholds are based on a broader income measure than AGI, notably excluding above-the-line deductions, which is why in the SOI tables we see returns with taxable SS income in AGI bins below the threshold.)

Are many of the excess returns below the SS threshold? That would lead to a situation with too many SS-income returns and too much SS income overall, but with SS taxable income and the number of returns with SS taxable income closer to SOI.
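To make that hypothesis concrete, here is a simplified sketch of the statutory two-tier formula for a single filer, using the unindexed $25,000 and $34,000 base amounts. It is not Tax-Calculator's implementation. The function name `taxable_ss`, the helper `totals`, and all records are hypothetical, and `other_income` stands in loosely for modified AGI plus tax-exempt interest.

```python
def taxable_ss(other_income, ss_benefits, thd50=25_000.0, thd85=34_000.0):
    """Simplified taxable Social Security for a single filer (illustrative only)."""
    provisional = other_income + 0.5 * ss_benefits
    if provisional <= thd50:
        return 0.0
    if provisional <= thd85:
        return min(0.5 * (provisional - thd50), 0.5 * ss_benefits)
    tier1 = min(0.5 * ss_benefits, 0.5 * (thd85 - thd50))
    return min(0.85 * (provisional - thd85) + tier1, 0.85 * ss_benefits)

def totals(recs):
    """Weighted counts and totals for total and taxable Social Security."""
    n_ss = sum(w for w, _, ss in recs if ss > 0)
    total_ss = sum(w * ss for w, _, ss in recs)
    n_taxable = sum(w for w, inc, ss in recs if taxable_ss(inc, ss) > 0)
    total_taxable = sum(w * taxable_ss(inc, ss) for w, inc, ss in recs)
    return n_ss, total_ss, n_taxable, total_taxable

# Hypothetical weighted records: (weight, other_income, ss_benefits).
base = [(100.0, 30_000.0, 20_000.0), (100.0, 20_000.0, 12_000.0)]
# Excess records whose provisional income (12,500) is below the first threshold.
extra_low_income = [(60.0, 5_000.0, 15_000.0)]

before = totals(base)
after = totals(base + extra_low_income)
print(before)
print(after)
```

In this toy case the extra low-income records raise both the weighted count and the total of Social Security income, while the count and total of taxable Social Security are unchanged.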

jdebacker commented 3 years ago

@donboyd5 @MattHJensen Should this issue be moved to TaxData?

donboyd5 commented 3 years ago

@jdebacker @MattHJensen Yes, it does seem like a taxdata issue to me.