PSLmodels / taxdata

The TaxData project prepares microdata for use with the Tax-Calculator microsimulation project.
http://pslmodels.github.io/taxdata/

FYI: Comparisons of IRS summary statistics for 2017 to puf.csv with: (1) default stages 1, 2, and 3, (2) custom stage 1 growfactors and default stage 2, and (3) version #2 reweighted #389

Open donboyd5 opened 3 years ago

donboyd5 commented 3 years ago

You will find the 3 summary report files here. They should be visible to all.

All comparisons are for filers only, as defined in PSLmodels/Tax-Calculator#2501 prior to today (2020-11-06). @MattHJensen suggested an improvement to the filer definition, which I will implement relatively soon, but I don't think the results will change much.

Each summary report file has 3 sections:

  1. Summary section that compares file totals (for tax filers)
  2. Detailed section that compares results by income range
  3. Documentation section that shows how I mapped puf variables to IRS concepts

The files are:

  1. irs_pufdefault_comparison.txt: this compares the official puf.csv from 2020-08-20, using default (Tax-Calculator built-in) methods for stage 1 growfactors, stage 2 weights, and stage 3 interest income adjustments to grow to 2017.

  2. irs_pufregrown_comparison.txt: official puf.csv preprocessed with custom growfactors, using default stage 2 weights, and NOT using stage 3 interest income adjustments.

  3. irs_pufregrown_reweighted_comparison.txt: official puf.csv preprocessed with custom growfactors, starting from default stage 2 weights and NOT using stage 3 interest income adjustments, THEN reweighted in an effort to come close to many IRS totals. In general, I tried to target all 34 variables summarized in the 3 files, but I made a few exceptions for the 4 lowest income ranges, where some of the puf values were very far from IRS values and I was concerned that trying to hit the IRS values would distort other variables. In total there were ~32 variables x 18 income ranges, or ~576 targets (give or take a few). FWIW, the solution took 4.5 seconds. (A rough sketch of this kind of bounded reweighting problem appears just below this list.)
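For illustration only, and not the code actually used for this reweighting, a bounded least-squares reweighting problem of this general shape might look like the sketch below. All names (X, w0, targets) and the synthetic data are mine; the 50x cap mirrors the ratio cap mentioned later in this thread.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
n_records, n_targets = 1_000, 20                   # stand-ins for the real problem size
X = rng.lognormal(size=(n_records, n_targets))     # per-record values of targeted variables
w0 = rng.uniform(50, 500, size=n_records)          # original weights
targets = X.T @ (w0 * rng.uniform(0.8, 1.2, size=n_records))  # synthetic "IRS" totals

# Solve min ||X.T w - targets|| subject to 0 <= w <= 50 * w0
res = lsq_linear(X.T, targets, bounds=(np.zeros(n_records), 50 * w0))
w_new = res.x

print("max new/old weight ratio:", (w_new / w0).max())
print("records driven to zero weight:", int((w_new == 0).sum()))
```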

A few comments and observations:

My next steps will be to try to apportion national weights on this file to a selected set of states.

Then, as I learn what may be wrong with this national file, I will go back and try to improve it; in the first iteration of that I will fix up the filer-determination code as pointed out by @MattHJensen in PSLmodels/Tax-Calculator#2501.

Many thanks for any criticism you can provide.

MattHJensen commented 3 years ago

@donboyd5, thanks very much for these summaries. I am looking through them now. This certainly seems promising.

If you could share the regrown-reweighted file, I'd appreciate that. I'd like to poke around with what's happening to non-targeted variables and how much the weights are moving.

donboyd5 commented 3 years ago

Thanks much, @MattHJensen, for looking. Please see the email I sent with a link to the folder.

donboyd5 commented 3 years ago

The weights change quite a bit. I capped the ratio of new weights to old weights at 50. Here is a quick distribution:

[image: histogram of the ratio of new weights to old weights]
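A distribution like the one pictured can be summarized in a few lines of pandas. This is a hypothetical sketch: the filename puf_reweighted.csv is a stand-in, and it assumes the file carries the original weight s006 alongside the reweighted weight s006_rwt (the latter name appears in a later comment).

```python
import pandas as pd

# Hypothetical file and column names; s006_rwt is referenced later in this thread.
filers = pd.read_csv("puf_reweighted.csv")
ratio = filers["s006_rwt"] / filers["s006"]

print(ratio.describe(percentiles=[0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99]))
print("ratios at the 50x cap:", int((ratio >= 50).sum()))
print("zero-weight records:", int((filers["s006_rwt"] == 0).sum()))
```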

donboyd5 commented 3 years ago

@MattHJensen, I put 3 more files in the folder available to people with access to puf.csv:

MattHJensen commented 3 years ago

> I put 3 more files in the folder available to people with access to puf.csv:

I saw the extra files and appreciate your having included them. Thanks for the overview here, too.

> The weights change quite a bit. I capped the ratio of new weights to old weights at 50. Here is quick distribution:

This replicated on my machine, which is a nice check I've got the right files.

It looks like 4,216 records are unused in the reweighted file (`(filers.s006_rwt == 0).sum()`), contributing to the ratios of 0 at the left end of the distribution.

In the past, the TaxData project has sought to minimize changes in weights (as you know) and has selected targets parsimoniously to avoid distorting the relationships among variables from their manifestations in the base data. So it is a departure to target more thoroughly and allow weights to move more freely. But there is a great deal to be said for hitting a broader set of SOI targets, and it's not clear that relationships in the base data year are as meaningful as they used to be given elapsed time and policy changes. All of that is just to say, as the maintainer of a project that requires tax data extrapolations as inputs, I'm really enjoying looking at these comparisons and thinking about the tradeoffs.

@chusloj and @andersonfrailey have been thinking about how to move TaxData's stage 3 interest adjustment into stage 2, so they may be interested to see how an alternative approach to setting up the problem could potentially make it easier.

donboyd5 commented 3 years ago

Thanks for looking, @MattHJensen. I agree:

1) It is attractive to keep the changes in weights minimal.
2) That becomes less attractive the further we get from the base year, on the assumption that changes in the economy, changes in tax law, and changes in behavior make 2011 relationships less relevant the further we are from 2011.
3) It also becomes less attractive if keeping weight changes minimal leaves important variables, those that drive the revenue and distributional impacts of tax policies, far off.

The question is, how much is too much? With sufficient time, we could work both ends toward the middle. Define a set of variables we care about, and define "correct" values for those variables (e.g., the IRS published totals). Then:

1) From one end, start with a minimal set of variables to target that we suspect are the most important (e.g., AGI and the number of returns by income range, crossed by at least 2 marital statuses), and see how far off the important untargeted variables are. Supplement this by running Tax-Calculator to get (a) tax liability under current law, and see how far it (or several related variables) is from what we expect to be true (in total, by AGI range, and by other cuts), and (b) the change in tax liability under an important policy alternative, and see how far off this is from what we expect from other sources (JCT? TPC? taxdata default?).

Also evaluate how much the weights had to change to hit or approximate these targets. It is possible to put restrictions on how much they change. When doing that, two things can happen: (i) we might still hit the targets by making many smaller changes to other weights, which might be desirable, or (ii) we might fall further from some targets, and we'd have to evaluate whether that's an acceptable compromise. We might be glad to be 5% off for an unimportant target if it means we don't have to jerk weights around so much, but we might not want to be 5% off for AGI or the number of returns.

It is also possible, in theory, to prioritize targets - e.g., put a weight of 1 on hitting AGI targets in income ranges 9 and 10, a weight of 0.8 on ranges 1 and 2, and a weight of only 0.2 on taxable interest in any income range (a rough sketch of one way to encode such priorities appears below). That seems like an important extension, and I have done variants of it. Of course, you then need to quantify judgments about what's most important and what's not.

Then add the next most important variable and repeat.

2) From the other end, start with a maximal set of targeted variables and do the same evaluation. Then drop the least important variable and repeat.

After a few iterations, we'd probably get a good sense of where the happy medium is. We might even be able to create some rules of thumb that help us make our judgments transparent and repeatable.
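To make the priority idea above concrete, here is a rough sketch, my illustration rather than anything implemented in this thread, of one way to encode target priorities: express each target as a percentage deviation and scale it by a priority weight before solving the bounded least-squares problem. All names and data are synthetic.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(1)
n_records, n_targets = 1_000, 30
A = rng.lognormal(size=(n_targets, n_records))        # rows = targets, columns = records
w0 = rng.uniform(50, 500, size=n_records)             # starting weights
b = (A @ w0) * rng.uniform(0.9, 1.1, size=n_targets)  # synthetic target totals

# Hypothetical priorities: 1.0 for must-hit targets (AGI, return counts),
# 0.8 for moderately important ones, 0.2 for low-priority items.
priority = rng.choice([1.0, 0.8, 0.2], size=n_targets)

# Scale each row so targets enter the objective as priority-weighted
# percentage deviations rather than raw dollar amounts.
scale = priority / b
res = lsq_linear(A * scale[:, None], b * scale,
                 bounds=(np.zeros(n_records), 50 * w0))
w_new = res.x

pct_miss = 100 * (A @ w_new - b) / b
print("worst miss among priority-1 targets (%):",
      np.abs(pct_miss[priority == 1.0]).max())
```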

My problem is that right now I'm racing the clock and can't iterate much. Many of the variables I targeted seem essential for either evaluating tax policies or for apportioning weights across states. I do hope to do some analyses comparing tax calculations on the default puf and a regrown reweighted puf, and also on at least one policy variant, although it will depend on how fast I am at other things. And if you learn that this has just wrenched the data around too much and is leading to implausible results in some areas in comparison to the default puf, that would be really valuable to know.
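A sketch of the kind of liability comparison mentioned above might look like the following. It uses only standard Tax-Calculator calls (Records, Policy, Calculator); the reweighted filename is a hypothetical stand-in, and a real run on that file would also require supplying the custom weights rather than relying on the default puf_weights table.

```python
from taxcalc import Calculator, Policy, Records

def iitax_total_2017(puf_path):
    """Weighted 2017 current-law individual income tax, in billions."""
    recs = Records(data=puf_path)                 # default growfactors/weights/ratios
    calc = Calculator(policy=Policy(), records=recs)
    calc.advance_to_year(2017)
    calc.calc_all()
    return calc.weighted_total('iitax') / 1e9

# "puf_regrown_reweighted.csv" is a stand-in name for the alternative file.
for path in ["puf.csv", "puf_regrown_reweighted.csv"]:
    print(path, round(iitax_total_2017(path), 1), "($B)")
```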

donboyd5 commented 3 years ago

I should add that some of the iteration could result in us learning that some growfactors could be better. In my mind, the 2017 IRS national data gives us valuable information on how average values changed between 2011 and 2017 that could be used to modify growfactors. It is possible to be far more rigorous than I have been and that would be valuable. I just modified a few growfactors that seemed very important and where growth in IRS data was significantly different than what existing growfactors implied.
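As one illustration of the kind of adjustment being described (my sketch, with a made-up IRS growth number and an assumed column name), a growfactor column could be rescaled so that its cumulative 2012-2017 growth matches the growth observed in IRS totals:

```python
import pandas as pd

gf = pd.read_csv("growfactors.csv").set_index("YEAR")   # taxdata growfactors file

# Cumulative growth implied by the interest-income factor over 2012-2017
# (column name AINTS assumed from the taxdata growfactors file).
implied = gf.loc[2012:2017, "AINTS"].prod()

irs_growth = 0.95                                 # hypothetical: 2017 IRS total / 2011 IRS total
per_year = (irs_growth / implied) ** (1 / 6)      # spread the adjustment evenly over 6 years
gf.loc[2012:2017, "AINTS"] *= per_year
gf.reset_index().to_csv("growfactors_custom.csv", index=False)
```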

MattHJensen commented 3 years ago

> After a few iterations, we'd probably get a good sense of where the happy medium is. We might even be able to create some rules of thumb that help us make our judgments transparent and repeatable.

Yes, this makes sense. As does, for a time-constrained new project, picking a sensible starting point and then looking for problems. I'm having fun with the data so far and will report back here with anything that seems of interest.

> I should add that some of the iteration could result in us learning that some growfactors could be better. In my mind, the 2017 IRS national data gives us valuable information on how average values changed between 2011 and 2017 that could be used to modify growfactors.

This makes sense as well, and I suspect TaxData maintainers agree too. Even before this, it would probably be good to refine TaxData's filer identification strategy.

donboyd5 commented 3 years ago

@MattHJensen, please see the updated filers function in PSLmodels/Tax-Calculator#2501. I plan to use it in the next iteration of reweighting, early next week, and welcome comments.

jdebacker commented 3 years ago

@donboyd5 @MattHJensen I am unsure of the status of this issue, but should it be moved over to the TaxData repo?

donboyd5 commented 3 years ago

@jdebacker @MattHJensen Yes, I think that makes sense. After taxdata is updated again soon, I hope to update some of this analysis.