PSLmodels / taxdata

The TaxData project prepares microdata for use with the Tax-Calculator microsimulation project.
http://pslmodels.github.io/taxdata/

Fix the report #415

Closed bodiyang closed 2 years ago

bodiyang commented 2 years ago

This PR addresses a problem found during the CBO baseline update: the CPS projected tax liability showed no change after the CBO baseline was updated.

After investigation, the cause is that the updated grow factors were not used when constructing the new Records object in report.py.
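To make the failure mode concrete, here is a minimal toy sketch of how this kind of bug can arise. It is not the code in report.py and does not use taxcalc; `ToyRecords`, `DEFAULT_GROWTH`, `UPDATED_GROWTH`, and the flat tax rate are all hypothetical stand-ins. The idea is that if a constructor silently falls back to default grow factors, the "new" and "current" projections come out identical.

```python
# Toy illustration only: none of these names come from report.py or taxcalc.
FLAT_RATE = 0.2  # hypothetical flat tax rate standing in for the tax logic

DEFAULT_GROWTH = {2022: 1.00, 2023: 1.03}  # stands in for the old grow factors
UPDATED_GROWTH = {2022: 1.00, 2023: 1.05}  # stands in for the updated baseline


class ToyRecords:
    """Holds base-year incomes and weights and grows income by year."""

    def __init__(self, incomes, weights, growth=None):
        self.incomes = incomes
        self.weights = weights
        # If no growth table is passed, the default (old) one is used silently.
        self.growth = DEFAULT_GROWTH if growth is None else growth

    def total_liability(self, year):
        """Weighted total liability for the given year under the toy flat tax."""
        factor = self.growth[year]
        return sum(w * inc * factor * FLAT_RATE
                   for inc, w in zip(self.incomes, self.weights))


incomes = [50_000.0, 120_000.0]
weights = [1_000.0, 500.0]

current = ToyRecords(incomes, weights)                           # old baseline
buggy_new = ToyRecords(incomes, weights)                         # growth not passed
fixed_new = ToyRecords(incomes, weights, growth=UPDATED_GROWTH)  # growth passed

# The buggy "new" projection cannot differ from the current one.
print(buggy_new.total_liability(2023) - current.total_liability(2023))
# Once the updated factors are passed explicitly, a difference appears.
print(fixed_new.total_liability(2023) - current.total_liability(2023))
```

In the real report the analogous fix would be to pass the updated grow factors explicitly wherever the new records are built, rather than relying on a default.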

jdebacker commented 2 years ago

A question I had, and maybe @andersonfrailey can answer this, does one want the new grow factors in the comparison of projections? There is already a comparison of the changes in the grow factors earlier in the report. Are the differences in tax liability to look at the effects of changes in the weights (holding constant the grow factors)? Or should, as @bodiyang is proposing, the liability differences reflect both changes in grow factors and in weights?
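One way to frame the two options in this question is as a decomposition. The sketch below uses made-up incomes, weights, and grow factors with a hypothetical flat-rate tax (none of it is taxcalc or report.py code). It separates the change in total liability into a weights-only effect, a grow-factors-only effect, and an interaction term.

```python
# Hypothetical toy numbers; a flat rate stands in for the full tax calculation.
RATE = 0.2
incomes = [50_000.0, 120_000.0]


def liability(weights, growth):
    """Weighted total liability for one year under the toy flat tax."""
    return sum(w * inc * growth * RATE for inc, w in zip(incomes, weights))


old_weights, new_weights = [1_000.0, 500.0], [1_010.0, 520.0]
old_growth, new_growth = 1.03, 1.05

base = liability(old_weights, old_growth)

# Effect of the new weights, holding grow factors at their old values.
weights_only = liability(new_weights, old_growth) - base
# Effect of the new grow factors, holding weights at their old values.
growth_only = liability(old_weights, new_growth) - base
# Effect of changing both at once, as proposed in this PR.
combined = liability(new_weights, new_growth) - base
# Whatever is left over after the two separate effects.
interaction = combined - weights_only - growth_only

for label, value in [("weights only", weights_only),
                     ("grow factors only", growth_only),
                     ("combined", combined),
                     ("interaction", interaction)]:
    print(f"{label:>18}: {value:,.0f}")
```

Reporting the weights-only line answers the first reading of the question, and the combined line answers the second.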

bodiyang commented 2 years ago

@andersonfrailey here is the updated report, taxdata_report_2022-08-16.pdf, after adjusting the grow factors.

bodiyang commented 2 years ago

Report with PUF projections: taxdata_report_2022-08-17.pdf

jdebacker commented 2 years ago

I am surprised at how off the PUF estimates are in the latest report above (when looking at the "Current" lines). So I computed revenue with the PUF + taxcalc and got much more reasonable numbers. Not sure of the difference, but here's what I did:

# Install and import (the leading "!" runs a shell command from a notebook cell)
! pip install taxcalc
import taxcalc
import pandas as pd

# Tax-Calculator setup: default policy and PUF records read from a local puf.csv
pol = taxcalc.policy.Policy()
rec = taxcalc.records.Records(data="/Users/jason.debacker/repos/tax-calculator/puf.csv")
tc_base = taxcalc.calculator.Calculator(policy=pol, records=rec)

# Run Tax-Calculator for each year and collect weighted revenue totals
rev_base = {'IIT': {}, 'Payroll': {}}
for t in range(2021, 2032):
    tc_base.advance_to_year(t)
    tc_base.calc_all()
    rev_base['IIT'][t] = tc_base.weighted_total('iitax')
    rev_base['Payroll'][t] = tc_base.weighted_total('payrolltax')

# Format table of revenue estimates: one row per tax, one column per year
rev_base_df = pd.DataFrame.from_dict(rev_base).T
pd.options.display.float_format = '${:.3f}'.format
rev_base_df['2021-2031'] = rev_base_df.sum(axis=1)
# Convert dollars to billions of dollars (shown as the notebook cell output)
rev_base_df / 1000000000

The result (billions of dollars):

             2021     2022     2023     2024     2025     2026     2027     2028     2029     2030     2031  2021-2031
IIT       1198.63  1825.05  1899.51  1995.44  2099.86  2442.38  2519.62  2619.86  2719.81  2821.36  2930.06    25071.6
Payroll   1250.82   1306.3   1362.3   1422.8  1486.19  1548.82  1607.98  1667.99  1728.66  1790.42  1856.27    17028.6

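As a quick arithmetic check on the figures above, the short sketch below re-adds the yearly values and compares them with the 2021-2031 column. The numbers are copied from the table (billions of dollars). The 0.1 tolerance is an assumption chosen to absorb display rounding.

```python
# Yearly estimates copied from the table above, in billions of dollars.
yearly = {
    "IIT": [1198.63, 1825.05, 1899.51, 1995.44, 2099.86, 2442.38,
            2519.62, 2619.86, 2719.81, 2821.36, 2930.06],
    "Payroll": [1250.82, 1306.3, 1362.3, 1422.8, 1486.19, 1548.82,
                1607.98, 1667.99, 1728.66, 1790.42, 1856.27],
}
# The 2021-2031 column as displayed in the table.
reported_total = {"IIT": 25071.6, "Payroll": 17028.6}

TOLERANCE = 0.1  # assumed allowance for rounding in the displayed values

checks = {}
for name, values in yearly.items():
    recomputed = sum(values)
    # True when the recomputed sum matches the displayed total within rounding.
    checks[name] = abs(recomputed - reported_total[name]) < TOLERANCE
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported_total[name]}")
```

This only confirms the table is internally consistent. It says nothing about why the report's "Current" lines differ from these estimates.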
bodiyang commented 2 years ago

I agree, these current/old IIT and Payroll estimates look more reasonable. There appears to be a 3% to 4% difference between the old and the new. So the problem in the report is how report.py constructs that specific current/old calculator object.