PolicyEngine / policyengine-us

The PolicyEngine US Python package contains a rules engine of the US tax-benefit system, and microdata generation for microsimulation analysis.
https://policyengine.org/us
GNU Affero General Public License v3.0

Resolve integration test failures for 2021 taxes #993

Open martinholmer opened 2 years ago

martinholmer commented 2 years ago

Methods used to generate results in the table below are discussed in this topic:

And itemized deduction results are discussed in this topic:

In the table below, the known issues blocking progress are:

Unknown causes of other differences are under investigation.

Number of tax units (out of random samples of 100,000) with
an income tax difference between PolicyEngine-US and TAXSIM35
- p-x & e-k/m sample assumptions are described below this table
- simplest sample to have differences is shown in this table
- what a 'difference' means is explained below this table
- :loop indicates use of optional testing logic (see #2389)
- NONE* indicates a situation described in topic #2570
-----------------------------------------------------------------
: MODEL VERSIONS :   :         MODEL          :   :  CAUSES OF  :
PolEngUS  TAXSIM35   :      DIFFERENCES       :   : DIFFERENCES :
+ branch  + patch    REGION   SAMPLE  NUM.DIFFS   RELEVANT.ISSUES
-----------------------------------------------------------------
0.779.0   12/30/22   US          x21          0   NONE
master  2024-05-31   AK                           NO INCOME TAX
                     AL:loop     k21          0   NONE
                     AR          k21       8547   #4586, others?
                     AZ:loop     k21          0   NONE
                     CA          k21        309   #4120, others?
                     CO:loop     k21          0   NONE
                     CT          k21          0   NONE
                     DC:loop     m21          0   NONE*
                     DE          k21          0   NONE
                     FL                           NO INCOME TAX
                     GA          m21          0   NONE*
                     HI          k21          0   NONE
                     IA:loop     k21          0   NONE
                     ID:loop     k21          0   NONE
                     IL          x21          0   NONE
                     IN          x21          0   NONE
                     KS          x21          0   NONE
                     KY          k21          0   NONE
                     LA          g21       2736   #4517, others?
                     MA          x21      CRASH   #4549, others?
                     MD:loop     m21          0   NONE*
                     ME:loop     m21          0   NONE*
                     MI          k21          0   NONE
                     MN          x21          0   NONE
                     MO:loop     m21          0   NONE*
                     MS          k21       3758   #4486, others?
                     MT          f21      16480   ?
                     NC          k21          0   NONE
                     ND:loop     k21          0   NONE
                     NE:loop     m21          0   NONE*
                     NH          x21          0   NONE
                     NJ          q21        134   #4475, others?
                     NM          m21          0   NONE*
                     NV                           NO INCOME TAX
                     NY          e21      46937   #4427, others?
                     OH:loop     k21      CRASH   #4549, others?
                     OK:loop     m21          0   NONE*
                     OR:loop     k21          0   NONE
                     PA          x21          0   NONE
                     RI          k21          0   NONE
                     SC          k21          0   NONE
                     SD                           NO INCOME TAX
                     TN                           NO INCOME TAX
                     TX                           NO INCOME TAX
                     UT:loop     k21          0   NONE
                     VA:loop     m21          0   NONE*
                     VT:loop     k21          0   NONE
                     WA          x21          0   NONE
                     WI          x21          0   NONE                         
                     WV:loop     k21          0   NONE
                     WY                           NO INCOME TAX
-----------------------------------------------------------------

Sample assumptions in two lists (each ordered from simple to more complex):

(1) The p-through-x assumption sequence used for US and for some states:

p21: 2021 tax units consisting of married couples and single individuals with up to four dependents, with each tax unit having wages as the only source of income and no expenses of any kind.

q21: like p21 sample except adds social security benefits as another source of income and adds three types of expenses: rent paid, local property taxes, and mortgage interest.

r21: like q21 sample except adds taxable interest as an additional source of income and adds childcare expenses.

s21: like r21 sample except adds short-term capital gains (but not losses) as an additional source of income.

t21: like s21 sample except adds long-term capital gains (but not losses) as an additional source of income.

u21: like t21 sample except adds qualified dividends as an additional source of income.

v21: like u21 sample except adds taxable pension benefits as an additional source of income.

w21: like v21 sample except adds rent received as an additional source of income.

x21: like w21 sample except adds self-employment income that is from either a qualified SSTB (specified service trade or business) or a qualified non-SSTB, subtracts childcare expenses, and reassigns qualified dividends, short-term capital gains, and long-term capital gains to taxable interest income.

(2) The e-through-k/m assumption sequence used for other states:

e21: like p21 sample except specifies $11,010 in taxable interest income for every tax unit (in order to side-step complex state EITC decoupling rules).

f21: like e21 sample except adds social security benefits and taxable pension benefits as additional sources of income.

g21: like f21 sample except adds rent paid and childcare expenses.

h21: like g21 sample except adds rent received and short-term capital gains (but not losses) as additional sources of income.

i21: like h21 sample except adds qualified dividends and long-term capital gains (but not losses) as additional sources of income.

j21: like i21 sample except adds self-employment income that is from either a qualified SSTB or a qualified non-SSTB, and subtracts childcare expenses.

k21: like j21 sample except adds local property taxes and mortgage interest as expenses.

m21: like k21 sample except subtracts self-employment income and doubles wages.
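As a rough illustration only, the e-through-k/m sequence above can be read as a chain of additions and removals applied to the previous sample. The sketch below encodes that reading; the item names (`wages`, `rent_paid`, and so on) and the `STEPS` layout are hypothetical labels chosen for this example, not variable names from PolicyEngine-US or TAXSIM35, and the doubling of wages in m21 is noted but not modeled.

```python
# Each step: (sample name, items added, items removed).
# Item names are hypothetical labels, not PolicyEngine-US or TAXSIM35 variables.
STEPS = [
    ("e21", {"wages", "taxable_interest"}, set()),  # interest fixed at $11,010
    ("f21", {"social_security_benefits", "taxable_pension_benefits"}, set()),
    ("g21", {"rent_paid", "childcare_expenses"}, set()),
    ("h21", {"rent_received", "short_term_capital_gains"}, set()),
    ("i21", {"qualified_dividends", "long_term_capital_gains"}, set()),
    ("j21", {"self_employment_income"}, {"childcare_expenses"}),
    ("k21", {"local_property_taxes", "mortgage_interest"}, set()),
    # m21 also doubles wages, which a set of item names cannot express.
    ("m21", set(), {"self_employment_income"}),
]


def build_samples(steps):
    """Return {sample name: frozenset of items present in that sample}."""
    samples = {}
    current = set()
    for name, added, removed in steps:
        # Apply this step's additions, then its removals, to the prior sample.
        current = (current | added) - removed
        samples[name] = frozenset(current)
    return samples


SAMPLES = build_samples(STEPS)
```

For example, `sorted(SAMPLES["k21"] - SAMPLES["j21"])` lists the two expense items that k21 adds to j21.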

Definition of Difference:

A difference is generally defined as the income tax amounts from the two models being more than one cent apart. Exceptions are made when the two models use different inflation indexing assumptions or when numerical precision differences between the two models cause rounding error. The following situations use a definition of difference that is larger than one cent:

US x21 sample: three cents (single vs double precision)

AZ all samples: eight cents

IA k21 sample: twenty cents

IN all samples: six cents

MA x21 sample: two dollars

MI all samples: forty-five cents

MO all samples: four cents

ND j21 and k21 samples: thirty-two cents

NY all samples: nine cents

OR all samples: eleven cents

RI all samples: seven cents

VT h21 through k21 samples: seven dollars (single vs double precision)

WI all samples: eight cents
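A minimal sketch of how the definition and the exceptions listed above could be applied when counting differences. This is not the script used to produce the table: the function names, the `"*"` key for "all samples", and the choice to compare in whole cents (to avoid floating-point noise right at the threshold) are all assumptions made for this example.

```python
DEFAULT_TOLERANCE_CENTS = 1

# Tolerances in cents, copied from the list above.
# "*" is a hypothetical marker meaning "all samples" for that region.
TOLERANCE_CENTS = {
    ("US", "x21"): 3,
    ("AZ", "*"): 8,
    ("IA", "k21"): 20,
    ("IN", "*"): 6,
    ("MA", "x21"): 200,
    ("MI", "*"): 45,
    ("MO", "*"): 4,
    ("ND", "j21"): 32,
    ("ND", "k21"): 32,
    ("NY", "*"): 9,
    ("OR", "*"): 11,
    ("RI", "*"): 7,
    ("VT", "h21"): 700,
    ("VT", "i21"): 700,
    ("VT", "j21"): 700,
    ("VT", "k21"): 700,
    ("WI", "*"): 8,
}


def tolerance_cents(region, sample):
    """Look up the tolerance: sample-specific first, then region-wide, then default."""
    if (region, sample) in TOLERANCE_CENTS:
        return TOLERANCE_CENTS[(region, sample)]
    return TOLERANCE_CENTS.get((region, "*"), DEFAULT_TOLERANCE_CENTS)


def count_differences(taxes_a, taxes_b, region, sample):
    """Count tax units whose two tax amounts (in dollars) are more than the tolerance apart."""
    tol = tolerance_cents(region, sample)
    count = 0
    for a, b in zip(taxes_a, taxes_b):
        # Round the gap to whole cents so 0.01 vs 0.0100000000001 is not a difference.
        gap_cents = round(abs(a - b) * 100)
        if gap_cents > tol:
            count += 1
    return count
```

For example, `count_differences([100.00, 200.00], [100.01, 200.02], "CT", "k21")` counts only the second tax unit, because a gap of exactly one cent is within the default tolerance.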

MaxGhenis commented 2 years ago

Thanks @martinholmer for these valuable numbers. I edited the title to reflect our goal. If you update the numbers for new versions, could you do so in this issue thread?

nikhilwoodruff commented 2 years ago

@martinholmer no (but I don't usually get GitHub email notifications, just the website notifications).

nikhilwoodruff commented 2 years ago

Thanks for this progress @martinholmer! Nice to see the zeros for these samples.