PSLmodels / taxdata

The TaxData project prepares microdata for use with the Tax-Calculator microsimulation project.
http://pslmodels.github.io/taxdata/

Fix PUF SOI estimates #411

Open andersonfrailey opened 2 years ago

andersonfrailey commented 2 years ago

This PR addresses issue #399.

The updatesoi.py file is used to automatically update our SOI estimates, but the range of indices used to add up total wages for those with AGI greater than $1 million was off and excluded some of the data, as @donboyd5 figured out. This PR fixes that and adds a check to updatesoi.py to prevent an issue like this from going undetected in the future.

The bug affected our SOI estimates for 2015-2017. They have all been fixed in this PR.
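
To illustrate the kind of bug and guard described above, here is a minimal sketch. It is not the actual updatesoi.py code: the bin layout, the numbers, and the names `wages_by_agi_bin`, `sum_bins`, `check_total`, and `TOP_BINS_START` are all hypothetical. The point is that a slice that stops one bin short drops data without raising an error, while comparing the pieces against a separately reported total catches it.

```python
# Hypothetical wage amounts by AGI bin, lowest to highest (made-up numbers).
wages_by_agi_bin = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
reported_total = 210.0  # hypothetical "all returns" figure from the same table

TOP_BINS_START = 3  # hypothetical index of the first bin with AGI > $1 million


def sum_bins(values, start, stop=None):
    """Sum values[start:stop]; stop=None runs through the last bin."""
    return sum(values[start:stop])


def check_total(pieces, total, tol=1e-6):
    """Raise if the pieces do not add up to the separately reported total."""
    diff = abs(sum(pieces) - total)
    if diff > tol * max(1.0, abs(total)):
        raise ValueError(f"pieces sum to {sum(pieces)}, expected {total}")


# An off-by-one stop index drops the top bin without raising any error.
buggy_top_wages = sum_bins(wages_by_agi_bin, TOP_BINS_START, 5)
fixed_top_wages = sum_bins(wages_by_agi_bin, TOP_BINS_START)

# Guard: the below-threshold and above-threshold pieces must rebuild the total.
below = sum_bins(wages_by_agi_bin, 0, TOP_BINS_START)
check_total([below, fixed_top_wages], reported_total)
```

Passing `buggy_top_wages` to `check_total` instead would raise a `ValueError`, which is the behavior a check like this is meant to provide.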

andersonfrailey commented 2 years ago

This PR is just about done, but with the changes there's a big increase in tax liability for 2030 and 2031 that I can't explain. I've attached a table comparing the tax liabilities that were found with taxcalc 3.2.1.

[image: Screen Shot 2022-01-10 at 2 58 31 PM]

donboyd5 commented 2 years ago

Would it be hard to construct a table for two years, each with total wages by an income classifier (e.g., AGI range) pre-PR and with PR, one for (a) the year before the first surprising year (i.e., 2029), and one for the first surprising year (2030)?

For example, tables a and b would have stubs such as:

income range    wages pre-PR    wages PR    change
<= $0
...
$1m-10m
$10m+
sum

We'd then see (1) which income ranges are causing the problem, and (2) how those income ranges changed between 2029 and 2030. Of course it could be something else, but this seems like the most likely suspect.

If we verify that the issue is caused by changes in wages, and we figure out which income ranges are at work (probably the top 2 or 3), it would then make sense to look at growfactors moving from 2029 to 2030, and at wage targets for 2029 and 2030. My guess (uninformed) is that there is something surprising about the targets in 2029, but it could be growfactors. Of course it could be something else entirely, such as the number of returns with wages (again, the same sort of breakdown by number of returns would be helpful), or even something out of left field, but I'd suggest looking at wages first.
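
The comparison suggested above could be assembled along these lines. This is only a sketch: the record layout `(agi, wages, weight)`, the AGI cut points, the intermediate "$0-1m" range, and the function names are assumptions made for illustration, and the sample numbers are invented.

```python
from bisect import bisect_left

# Hypothetical AGI upper bounds for each range; the last range is open-ended.
AGI_UPPER_BOUNDS = [0, 1_000_000, 10_000_000]
AGI_LABELS = ["<= $0", "$0-1m", "$1m-10m", "$10m+"]


def wages_by_agi_range(records):
    """Sum weighted wages per AGI range for (agi, wages, weight) records."""
    totals = [0.0] * len(AGI_LABELS)
    for agi, wages, weight in records:
        # bisect_left puts agi == bound in the lower range (e.g. 0 in "<= $0").
        totals[bisect_left(AGI_UPPER_BOUNDS, agi)] += wages * weight
    return totals


def comparison_table(pre_records, pr_records):
    """Rows of (range, wages pre-PR, wages PR, change), plus a sum row."""
    pre = wages_by_agi_range(pre_records)
    new = wages_by_agi_range(pr_records)
    rows = [(lab, a, b, b - a) for lab, a, b in zip(AGI_LABELS, pre, new)]
    rows.append(("sum", sum(pre), sum(new), sum(new) - sum(pre)))
    return rows


# Invented records for one year: same returns, different weights with the PR.
pre_pr = [(-100.0, 0.0, 2.0), (50_000.0, 40_000.0, 3.0),
          (2_000_000.0, 900_000.0, 1.0), (20_000_000.0, 5_000_000.0, 1.0)]
with_pr = [(-100.0, 0.0, 2.0), (50_000.0, 40_000.0, 3.0),
           (2_000_000.0, 900_000.0, 2.0), (20_000_000.0, 5_000_000.0, 1.0)]

table = comparison_table(pre_pr, with_pr)
for label, before, after, change in table:
    print(f"{label:<10}{before:>14,.0f}{after:>14,.0f}{change:>14,.0f}")
```

Running the same function on the 2029 and 2030 files would give tables a and b, and the "change" column shows which ranges move.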

Don

andersonfrailey commented 2 years ago

Deleted my last comment. I realized there was an issue with how I was looking at the data when I was making those charts.

jdebacker commented 2 years ago

@andersonfrailey Is this PR ready for review? The last comment made that unclear. Also, it could be helpful to produce the tables Don suggested to aid in review.

andersonfrailey commented 2 years ago

@jdebacker, I haven't been able to fix the issue with tax revenues jumping up in the last few years yet, but I'd definitely be open to others seeing if they get the same result when they run the changes in this PR. Spring break starts tomorrow so I should be able to work on those tables Don suggested this week!

jdebacker commented 2 years ago

Great - thanks for the update!

MattHJensen commented 2 years ago

@andersonfrailey, I ran make puf-files with this PR and hit several ITERATION_LIMIT and INFEASIBLE terminations during stage 2. Is this to be expected? Terminal output in this gist.

I'm going to push forward to replicate your revenue table and then create Don's suggested tables in the next few days, but thought I should check in on this. Thanks!

donboyd5 commented 2 years ago

A few thoughts:

2016: [image]

[image]

2029: [image]

2030 (shown) and 2031: [image]

I haven't looked at the targets but my intuition would be to look at the way the correction to the wage targets was implemented to see if the full set of new targets is internally inconsistent. This might be the underlying issue, and perhaps also causing undesirable results in other years, too, even though no flags were raised.

donboyd5 commented 2 years ago

@MattHJensen Inconsistent targets could arise if, for example:

There are other ways to be inconsistent, too, but these are the obvious ones.

jdebacker commented 1 year ago

@MattHJensen Can you test this branch again with the latest changes and see if you still get that error with iteration limits?

jdebacker commented 1 year ago

An update on this branch. I've tested this with the latest versions of Julia and Tulip and still get an iteration limit hit in 2029 and "infeasible" after that.

Will look into targets in more detail next, but was hoping a new solver would do the trick...
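
One cheap diagnostic to run before digging into the targets is sketched below. It assumes that stage 2 limits how far each record's weight may move, here by a fraction `tol` of its starting weight; that bound, the column names, and all numbers are hypothetical. Under that assumption, the weighted total for any single target can only land in a closed interval, so a target outside the interval makes the problem infeasible whichever solver is used. This is a necessary condition only: targets can each pass and still be jointly inconsistent.

```python
def achievable_range(values, weights, tol):
    """Min and max of sum(w * (1 + d) * x) with each |d| <= tol."""
    base = sum(w * x for w, x in zip(weights, values))
    # Each record can shift its contribution by at most tol * |w * x|.
    slack = tol * sum(abs(w * x) for w, x in zip(weights, values))
    return base - slack, base + slack


def unreachable_targets(columns, weights, targets, tol):
    """Names of targets that no allowed weight adjustment can hit."""
    bad = []
    for name, target in targets.items():
        low, high = achievable_range(columns[name], weights, tol)
        if not low <= target <= high:
            bad.append(name)
    return bad


# Invented example: three records, two wage targets (names are hypothetical).
weights = [100.0, 200.0, 50.0]
columns = {
    "wages_under_1m": [40_000.0, 60_000.0, 0.0],
    "wages_1m_10m": [0.0, 0.0, 2_000_000.0],
}
targets = {"wages_under_1m": 1.7e7, "wages_1m_10m": 2.0e8}

flagged = unreachable_targets(columns, weights, targets, tol=0.5)
print(flagged)
```

In this made-up case the top-range target is twice what the records can currently produce, which is beyond what a 50 percent weight adjustment can reach, so it is flagged.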

cc @andersonfrailey @donboyd5 @MattHJensen