Estimate tax functions from microdata that are monotonic in labor and capital income

rickecon commented 8 years ago

We found that our originally estimated tax functions produce multiple local optima, which in turn result in multiple equilibria. We estimated these functions to fit ETR, MTRx, and MTRy data from Tax-Calculator. This issue tracks the development and testing of a new functional form for the tax functions that fits the data well and has the properties necessary for a unique equilibrium that converges to the steady state in the long run over the time path of the economy.

rickecon commented 8 years ago

First, we found good evidence that a functional form for tax rates that is monotonically increasing in labor income (x) and capital income (y), respectively, is not too strong an assumption. The following figures are two separate perspectives of the same scatterplot of effective tax rates (ETRs) as a function of labor income (x) and capital income (y) from Tax-Calculator for 43-year-olds in the year 2016. etr_age_43_year_2016_data etr_age_43_year_2016_data2

The following four figures are the same two perspectives of the marginal tax rate of labor income and the marginal tax rate of capital income, respectively, both plotted as functions of both labor income (x) and capital income (y).

mtrx_age_43_year_2016_data

mtrx_age_43_year_2016_data2

mtry_age_43_year_2016_data

mtry_age_43_year_2016_data2

The blue figure below is a 3D histogram showing that all the mass for these points is for low capital income with a skewed fat-right-tailed distribution over labor income. This density plot applies to all of the scatterplots above. hist_age_43_year_20162

rickecon commented 8 years ago

The original tax function that we estimated looked like a good fit, but was not monotonically increasing in labor income (x) and capital income (y). The functional form was the following:

rickecon commented 8 years ago

I just finished trying a CES functional form that combines two univariate tax rate functions--tau(x) and tau(y)--both of which are monotonically increasing in their respective variables. The functional form was the following

This did not work because it is not well identified. The elasticity parameter epsilon changes the curvature of the fitted surface, but so do the parameters in the ratios of polynomials in tau(x) and tau(y). As a result, the estimated parameters changed substantially with each set of starting values I tried.

Next, I will try the Cobb-Douglas case of the CES function where epsilon = 1.
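The equations themselves did not survive in the text of this thread, so the following is only a minimal sketch of the kind of specification being discussed. It assumes a ratio-of-polynomials form for each univariate rate (parameter names a, b, min_z, max_z are illustrative) and a standard CES aggregator with rho = (eps - 1)/eps. It also assumes strictly positive rates so the CES powers are well defined. It is not necessarily the exact form estimated here.

```python
def tau_univariate(z, a, b, min_z, max_z):
    """Assumed ratio-of-polynomials rate.

    For z >= 0, a >= 0, b >= 0 and max_z > min_z it rises
    monotonically from min_z toward max_z.
    """
    poly = a * z**2 + b * z
    return (max_z - min_z) * poly / (poly + 1.0) + min_z


def ces_aggregate(tau_x, tau_y, alpha, eps):
    """Assumed CES aggregate of two positive rates.

    eps = 1 is handled as the Cobb-Douglas limit.
    """
    if abs(eps - 1.0) < 1e-12:
        return tau_x**alpha * tau_y ** (1.0 - alpha)
    rho = (eps - 1.0) / eps
    return (alpha * tau_x**rho + (1.0 - alpha) * tau_y**rho) ** (1.0 / rho)


# Illustrative values only: the univariate rate is non-decreasing in income.
rates = [tau_univariate(z, 1e-10, 1e-3, 0.05, 0.5) for z in (0.0, 1e3, 1e4, 1e5)]

# The general CES approaches the Cobb-Douglas case as eps -> 1.
near_cd = ces_aggregate(0.2, 0.3, 0.67, 1.0001)
exact_cd = ces_aggregate(0.2, 0.3, 0.67, 1.0)
```

Under these assumptions, eps and the polynomial coefficients both bend the fitted surface, which is consistent with the identification problem described above.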

jdebacker commented 8 years ago

The "C" parameter is not the same in \tau(x) and \tau(y), correct?

Also, can you normalize the constant to 1 in this specification? Then adjust the other parameters accordingly? It seems like you should be able to do this for \tau(x) or \tau(y) (and maybe both since alpha can adjust accordingly). That might help with the identification.

rickecon commented 8 years ago

@jdebacker Yes. That is a typo above. The F parameter in the first specification, and the C parameters in the specifications above should be 1. I just edited both of those.

kerkphil commented 8 years ago

If I use the following parameters with a CES, I get the following graph, which looks close and might be a good starting place for a minimizer.

A=D=1E-10, B=E=.001, minx=-.2, miny=-.4, maxx=.5, maxy=.4, alf=.67, eta = (elas -1)/elas = 2.

rickecon commented 8 years ago

@kerkphil How do you know how well that surface fits the data? If you have a population weighted sum of squared deviations, we can compare across runs. And what functional form are you using?

kerkphil commented 8 years ago

@rickecon This is just eyeballing the function vs the data for ETR. Both reach approximately the same minimum at the x=y=0 point, and both asymptote to about the same levels on the x and y axes. But, of course, a formal fitting needs to be done. I used a CES function with the parameter values indicated. My guess is that these would be good starting values for a formal fit via minimization of squared deviations.

rickecon commented 8 years ago

In my preliminary horse race between the Cobb-Douglas and the general CES above, I am getting a lower weighted sum of squared deviations from the Cobb-Douglas specification (e.g. 8390 Cobb-Douglas versus 8491 general CES). My intuition for this is that, even though Cobb-Douglas is a nested version of the more general CES function, the minimizer is not able to find a global minimum in the latter case because the elasticity parameter performs the same curvature role as do the coefficients in the ratios of polynomials. We may still have an identification problem in the Cobb-Douglas functional form, but it is to a lesser degree than in the general CES function. I think this is evidenced by the lower weighted sum of squared deviations.
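For reference, the comparison criterion can be sketched as below. This assumes the weighted sum of squared deviations is the sum over observations of weight times squared residual. The function name and the toy numbers are illustrative and are not taken from the estimation code.

```python
def weighted_sse(predicted, observed, weights):
    """Weighted sum of squared deviations between fitted and observed rates."""
    return sum(w * (p - o) ** 2 for p, o, w in zip(predicted, observed, weights))


# Toy data: two observations with population weights 2 and 3.
example = weighted_sse([0.10, 0.20], [0.10, 0.30], [2.0, 3.0])
```

Because the same data and weights are used for every specification, a lower value indicates a closer fit, which is what makes runs comparable.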

jdebacker commented 8 years ago

What if you start the CES off with epsilon=1 and the min values from the CD estimation?

I'll admit I do like the additional flexibility of the CES and am a bit surprised there is trouble finding a global min, since epsilon should be affecting the function in a significantly different way than the ratio of polynomials. E.g. you can think about the ratios as defining the edges (i.e. what the tax functions look like when x or y=0). Epsilon will tell you about the surface connecting those two edges, and the amount of curvature across that surface.

rickecon commented 8 years ago

@jdebacker @kerkphil . OK. After much testing, I think the verdict is that the general CES is not identified, even if we zero out the coefficients on the squared terms in the polynomials. I think this is evidence for using the Cobb-Douglas aggregator in which we estimate the max_x and max_y terms as well as the coefficients on the squared terms in the polynomials--specification 2 (CD). The table below shows the weighted sum of squared errors (wSSE) for the eight different specifications.

Spec.	max_x, max_y estimated	sq. terms included	wSSE (CD)	wSSE (CES)
1	No	Yes	8525.5	9935.5
2	Yes	Yes	8418.1	10075.5
3	No	No	8965.7	14607.7
4	Yes	No	8654.4	10214.0

Below are pictures of each of the fitted curves. But it is worth noting that the general CES function was always poorly identified. The evidence is that the final converged weighted sum of squared deviations in that specification changed whenever I changed the starting values. This was not the case with the Cobb-Douglas. Also, the general CES did not always converge. I could get it to converge by testing different starting values, but this was not a robust specification. This is unfortunate, as I really liked the shape of specification 1 (CES) as is shown below. But I like the fit of 2 (CD) as well.
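The starting-value check described here can be sketched as a multi-start run that records the spread of converged objective values. The objective below is a toy one-parameter function with two local minima, minimized by plain gradient descent. It stands in for the real weighted-SSE problem and is not the estimation code.

```python
def objective(p):
    """Toy non-convex objective with two local minima of different depth."""
    return (p**2 - 1.0) ** 2 + 0.3 * p


def gradient(p):
    """Analytic derivative of the toy objective."""
    return 4.0 * p * (p**2 - 1.0) + 0.3


def fit(start, step=0.01, iters=5000):
    """Plain gradient descent from `start`; returns the converged objective."""
    p = start
    for _ in range(iters):
        p -= step * gradient(p)
    return objective(p)


def multistart_spread(fit_fn, starts):
    """Run the fit from each start and return (min, max) of final objectives."""
    finals = [fit_fn(s) for s in starts]
    return min(finals), max(finals)


lo, hi = multistart_spread(fit, [-2.0, -0.5, 0.5, 2.0])
```

A spread near zero across starting values is what one would hope to see from a well-behaved specification. A large gap between lo and hi, as in this toy case, is the symptom reported for the general CES.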

Figure 1 (CD), Cobb-Douglas, not estimating the max_x and max_y, and including coefs on sq. terms. etr_age_43_year_2016_vspred1

Figure 1 (CES), General CES, not estimating the max_x and max_y, and including coefs on sq. terms. etr_age_43_year_2016_vspred2

Figure 2 (CD), Cobb-Douglas, estimating the max_x and max_y, and including coefs on sq. terms. etr_age_43_year_2016_vspred3

Figure 2 (CES), General CES, estimating the max_x and max_y, and including coefs on sq. terms. etr_age_43_year_2016_vspred4

Figure 3 (CD), Cobb-Douglas, not estimating the max_x and max_y, and excluding coefs on sq. terms. etr_age_43_year_2016_vspred5

Figure 3 (CES), General CES, not estimating the max_x and max_y, and excluding coefs on sq. terms. etr_age_43_year_2016_vspred6

Figure 4 (CD), Cobb-Douglas, estimating the max_x and max_y, and excluding coefs on sq. terms. etr_age_43_year_2016_vspred7

Figure 4 (CES), General CES, estimating the max_x and max_y, and excluding coefs on sq. terms. etr_age_43_year_2016_vspred8

rickecon commented 8 years ago

I estimated the ETR, MTRx, and MTRy functions with the Cobb-Douglas specification (2 (CD), shown in Figure 2 (CD) above), and the functional form performed very well. It was reliable and robust, it passed the eye test, and it was very flexible. Below are six plots of the fit--two each for ETR, MTRx, and MTRy--to the data for 43-year-olds in 2016.

The first figure in each pair plots the full range of income. However, 99% of the data are much closer to the origin, so the second plot zooms in on the fit for labor income and capital income less than $800,000. The fit is very good close to the origin, where most of the data lie, and it passes the eyeball test further out from the origin, where the data are more sparse.

Full income range predicted ETR vs. ETR data fig1

Truncated income range (<$800K) predicted ETR vs. ETR data fig2

Full income range predicted MTRx vs. MTRx data fig3

Truncated income range (<$800K) predicted MTRx vs. MTRx data fig4

Full income range predicted MTRy vs. MTRy data fig5

Truncated income range (<$800K) predicted MTRy vs. MTRy data fig6

rickecon commented 8 years ago

This functional form also works for a flat tax. As a test, I set all the ETR data to a constant 0.15. I used the same starting values for the parameters as in the previous runs, and the estimation came up with the following fit. flattaxfit
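One way to see why a flat tax can be nested is sketched below. It assumes a ratio-of-polynomials univariate rate combined by a Cobb-Douglas aggregator, with illustrative parameter names. The thread does not report which parameter values the estimator actually converged to in this test, so the values here are hypothetical.

```python
def tau_univariate(z, a, b, min_z, max_z):
    """Assumed ratio-of-polynomials rate, bounded between min_z and max_z."""
    poly = a * z**2 + b * z
    return (max_z - min_z) * poly / (poly + 1.0) + min_z


def cobb_douglas_rate(x, y, p):
    """Assumed Cobb-Douglas combination of the labor and capital income rates."""
    tau_x = tau_univariate(x, p["a"], p["b"], p["min_x"], p["max_x"])
    tau_y = tau_univariate(y, p["d"], p["e"], p["min_y"], p["max_y"])
    return tau_x ** p["alpha"] * tau_y ** (1.0 - p["alpha"])


# Hypothetical parameters: when min and max coincide at 0.15 in both
# dimensions, the rate no longer depends on income.
flat = {"a": 1e-10, "b": 1e-3, "min_x": 0.15, "max_x": 0.15,
        "d": 1e-10, "e": 1e-3, "min_y": 0.15, "max_y": 0.15, "alpha": 0.67}

flat_rates = [cobb_douglas_rate(x, y, flat)
              for x, y in ((0.0, 0.0), (5e4, 1e3), (8e5, 8e5))]
```

In this sketch every entry of flat_rates equals 0.15 up to floating-point rounding, whatever the income levels.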

rickecon commented 8 years ago

@jdebacker @kerkphil . I just ran the new tax functions (age_specific=True) on OG-USA with S=80 and J=7. The steady-state solved, but the max resource constraint error in the last period of the time path was 2.7e-02. In trying to diagnose the problem, I ran the following diagnostics.

WORKS. Set all ETR's, MTRx's, and MTRy's equal to their respective averages. I ran this with S=40 and J=3 and everything solved, both SS and TPI. All the plots of the time paths look right as well.
DOESN'T WORK. Set ETR's to average ETR, and limit MTRx's to be functions only of labor income and limit MTRy's to be functions only of capital income with S=40 and J=3. This one solves the SS nicely, but the final resource constraint error has not converged (-8.9e-03). Plots of the time paths have the same jump as the full run above. Not good.
WORKS. Let ETR's be functions of both labor income and capital income, but restrict MTRx's and MTRy's to be their respective averages with S=40 and J=3. Everything solved, both SS and TPI. All the plots of the time paths look right.
DOESN'T WORK. Let ETR's be functions of both labor income and capital income, but restrict MTRx's and MTRy's to be functions of labor income and capital income, respectively, with S=40 and J=3. This one solves the SS nicely, but the final resource constraint error has not converged (-8.1e-03). Plots of the time paths have the same jump as the full run above. Not good.

The conclusion is that the MTRx and MTRy functions being anything more than estimated constants is causing problems. Back to debugging land.

rickecon commented 8 years ago

The next test is to see if there is some interaction between the MTRx functions and the MTRy functions.

WORKS. Set ETR's and MTRx's equal to their respective averages, and let MTRy be a function of only capital income for S=40 and J=3. SS and TPI solve and all plots look good.
DOESN'T WORK. Set ETR's and MTRy's equal to their respective averages, and let MTRx be a function of only labor income for S=40 and J=3. This one won't even solve the SS. It starts giving NaN's for the Euler error values after about 10 iterations.
WORKS. Set ETR's and MTRx's equal to their respective averages, and let MTRy be the full function of labor income and capital income for S=40 and J=3. SS and TPI solve and all plots look good.
WORKS. Set MTRx's equal to the average MTRx, and let ETR's and MTRy's be the full functions of labor income and capital income for S=40 and J=3. SS and TPI solve and all plots look good.

So it looks pretty likely that the offending culprit is somewhere in the MTRx functions.

rickecon commented 8 years ago

@jdebacker. I am really stumped. I ran the following tests, and none of them worked. In all cases, the steady-state solved and TPI finished, but that last period resource constraint max error was too large (-1.06e-03, -1.29e-02, and -8.09e-03, respectively).

DOESN'T WORK. Set MTRx's equal to the age 42 (approximately, 12th element, index 11, in the age dimension) parameter values from each separate year of the budget window so that MTRx was not a function of age, but was a function of capital and labor income and time, according to the Cobb-Douglas specification. I let ETR's and MTRy's be the full functions of labor income and capital income for S=40 and J=3. SS solves and TPI solves, but the max resource constraint error in the last period of the TPI is -1.06e-03.
DOESN'T WORK. Set MTRx's equal to the age 42 (approximately, 12th element, index 11, in the age dimension) parameter values from 2016 so that MTRx was not a function of age or time, but was a function of both capital and labor income, according to the Cobb-Douglas specification. I let ETR's and MTRy's be the full functions of labor income and capital income for S=40 and J=3. SS solves and TPI solves, but the max resource constraint error in the last period of the TPI is -1.29e-02.
DOESN'T WORK. Made MTRx's a function of only labor income. I let ETR's and MTRy's be the full functions of labor income and capital income for S=40 and J=3. SS solves and TPI solves, but the max resource constraint error in the last period of the TPI is -8.09e-03.

Really stumped on this. The only specifications that work are with MTRx equal to its average.

jdebacker commented 8 years ago

@rickecon , I'll take a look. Seems like it's an issue with how some MTRx parameters are being passed/entered into functions.

rickecon commented 8 years ago

@jdebacker @kerkphil . BOOM! Found it. Everything now runs for S=40 and J=3 with the SS and TPI solving to a very high precision using the full ETR, MTRx, and MTRy functions. Man, that sucked. It was a very pernicious coding error--a small transposition of function arguments in a key piece of code.

The wage (w) and savings (b) arguments were transposed in the tax.MTR_labor() function call in line 392 of household.py. This is what was causing all the trouble. I will run the S=80 and J=7 model overnight to make sure that it runs well. Then I will submit this PR. Thanks for all the help. This may solve a lot of the problems that we were seeing.
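As a general guard against this class of bug, keyword-only arguments make a transposition impossible to write silently. The sketch below uses hypothetical toy functions. It is not the signature or the body of tax.MTR_labor() in OG-USA.

```python
def mtr_labor_positional(r, w, b, n):
    """Toy marginal rate on labor income (hypothetical, for illustration)."""
    x = w * n  # labor income
    y = r * b  # capital income
    return 0.30 * x / (1.0 + x) + 0.05 * y / (1.0 + y)


def mtr_labor_keyword(*, r, w, b, n):
    """Same toy function, but every argument must be passed by name."""
    return mtr_labor_positional(r, w, b, n)


correct = mtr_labor_positional(0.04, 1.2, 3.0, 0.5)
# w and b transposed: runs without error and returns a wrong number.
swapped = mtr_labor_positional(0.04, 3.0, 1.2, 0.5)

# With keyword-only arguments, order at the call site no longer matters.
by_name = mtr_labor_keyword(b=3.0, w=1.2, n=0.5, r=0.04)
```

Calling mtr_labor_keyword positionally raises a TypeError, so a swapped w and b would fail immediately instead of surfacing later as a resource constraint error.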

jdebacker commented 8 years ago

@rickecon - great find! It looks like that error has been with us for a while.

rickecon commented 8 years ago

S=80 and J=7 works beautifully now with the full ETR, MTRx, MTRy functions of age, time, labor income, and capital income. Here are some pictures of the equilibrium time paths of aggregate variables. figi figk figl

rickecon commented 8 years ago

Finished via PR #238.

PSLmodels / OG-Core

Estimate tax functions from microdata that are monotonic in labor and capital income #234