PSLmodels / OG-Core

An overlapping generations model framework for evaluating fiscal policies.
https://pslmodels.github.io/OG-Core/
Creative Commons Zero v1.0 Universal

New monotone tax function estimation and Euler equation solution #574

Closed rickecon closed 1 year ago

rickecon commented 4 years ago

We need a new functional form for our tax function estimation tau(x,y) that is still continuous and monotonically increasing in both labor income x = w*n and capital income y = r*b (first derivatives are everywhere positive), but that relaxes the condition that the second derivatives d2tau/dx2 and d2tau/dy2 be everywhere negative.

The current functional form for our three types of tax functions--effective tax rates, marginal tax rates on labor income, and marginal tax rates on capital income--is the following 12-parameter function. This tax function is detailed in DeBacker, Evans, and Phillips, "Integrating Microsimulation Models of Tax Policy into a DGE Macroeconomic Model," Public Finance Review, 47:2, pp. 207-275 (Mar. 2019).

[Image: taxfunc_orig]

This functional form satisfies the constraint that the first derivatives are everywhere positive and the second derivatives are everywhere negative. It takes the standard negative exponential shape in either the x or the y direction holding the other variable constant.

Two recent tax reforms--the Biden Plan from Feb. 2020 and a universal basic income (UBI) proposal--have shown that this function, with its restrictions, is not able to model well the characteristics of reforms that create "lumps" in the resulting tax data.

We need to maintain the following characteristics (constraints) on our tax function of choice:

  1. tau(x,y) must be monotonically increasing in both x and y. This condition ensures that a unique solution exists for the household Euler equations.
  2. tau(x,y) should be able to take on negative values for small values of x and y. Policies like the Earned Income Tax Credit (EITC) and UBI and other tax credit programs create negative effective tax rates and sometimes negative marginal tax rates at low income levels. Because of the monotonicity requirement in (1), these negative rates can only exist in our functional forms at the lowest income levels.
  3. tau(x,y) should be a function of both labor income x and capital income y in such a way that the two variables can interact to determine the tax rate tau. This is something that we found to be empirically true in the Tax-Calculator data from get_micro_data.py, which is documented in the DeBacker, et al (2019) paper.
  4. The function should approach an asymptote as x or y gets large.

Characteristics to change:

  1. We need to allow the second derivatives of the tax function, d2tau/dx2 and d2tau/dy2, to be either negative or positive. That is, the function should allow for multiple inflection points. [Image: taxfuncs_new]
  2. As either x or y gets small, the respective second derivatives can be positive or negative and the asymptote can be either negative infinity or some finite number. This allows for policies like the EITC or UBI, which flatten out effective tax rates at the low end in addition to making them negative.
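As one illustration of a shape with these properties (purely a hypothetical candidate, not a proposal for the OG-USA form), a nonnegative mixture of logistic CDFs plus a constant is monotone increasing, bounded above, can be negative at low incomes, and contributes one inflection point per component. A univariate sketch in x, holding y fixed:

```python
import numpy as np

def tau_logistic_mix(x, a, m, s, floor):
    """Hypothetical candidate: floor plus a nonnegative mix of logistic CDFs.
    Monotone increasing for a_i >= 0, bounded above by floor + sum(a),
    can be negative near x = 0, one inflection point per component."""
    x = np.atleast_1d(x)[:, None]
    return floor + (np.asarray(a) / (1.0 + np.exp(-(x - np.asarray(m)) / np.asarray(s)))).sum(axis=1)

x = np.linspace(0, 200_000, 1000)
# Illustrative parameters only: negative rates at the bottom (EITC/UBI-like),
# two "lumps", and an upper asymptote just below 0.35.
tau = tau_logistic_mix(x, a=[0.30, 0.25], m=[30_000, 120_000],
                       s=[8_000, 15_000], floor=-0.20)
assert np.all(np.diff(tau) > 0)       # monotone in x
assert tau[0] < 0 and tau[-1] < 0.35  # negative at bottom, bounded above
```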

Potential functions to try

This function needs to be able to have its parameters estimated with data, then reproduced to get the predicted values in OG-USA. It could be nonparametric, although everything we have used up to now has been parametric (DeBacker-Evans-Phillips, Gouveia-Strauss, linear, average (constant)).

cc: @jdebacker @kerkphil

rickecon commented 4 years ago

I received a question from @tomrod asking to clarify the proposed change in point 4 that allows for different asymptotes as x --> 0 or as y --> 0. The DeBacker-Evans-Phillips functional form in OG-USA has the negative exponential shape (e.g. log utility) such that the tax rate goes to a very negative number or to -infinity as either x --> 0 or as y --> 0. And the second derivatives of the function are both negative for small values of x and y (as x --> 0 or as y --> 0). See the yellow and green lines below, labeled "DEP".

[Image: Compare_ETR_functions]

The univariate Gouveia-Strauss functional form in OG-USA, used broadly in the literature, can have a finite asymptote (greater than -infinity) as x --> 0 or as y --> 0, and the second derivatives can be positive. See the pink line in the image above. @khakieconomics suggested using normal CDFs, which enforce this property, as would an arctangent function.

I see three shortcomings with the Gouveia-Strauss, normal CDF, and arctangent functional forms.

  1. These forms are univariate, functions of only total income, and do not allow for multivariate versions.
  2. They only allow for a finite asymptote as x --> 0 or as y --> 0, with second derivatives that are positive. They do not allow for the -infinity asymptote case with negative second derivatives.
  3. They only allow for one inflection point (where the second derivative changes sign). That is better than the zero inflection points of DEP, but reforms like the Biden plan, which increases taxes at the upper end of the distribution, might need up to three inflection points.
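For reference, the Gouveia-Strauss average-rate form can be sketched as follows; it is monotone with a finite limit of 0 as income goes to 0 and an upper asymptote b (parameter values below are illustrative only):

```python
import numpy as np

def gouveia_strauss(income, b, s, p):
    """Gouveia-Strauss (1994) average-rate form (univariate):
    tau(I) = b * (1 - (s * I**p + 1) ** (-1 / p)).
    Finite limit of 0 as I -> 0, asymptote b as I -> infinity."""
    return b * (1.0 - (s * income ** p + 1.0) ** (-1.0 / p))

income = np.linspace(1e-6, 500_000, 500)
tau = gouveia_strauss(income, b=0.35, s=1e-4, p=0.8)  # illustrative parameters
assert np.all(np.diff(tau) > 0)  # monotone increasing
assert tau[-1] < 0.35            # bounded above by b
```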

MaxGhenis commented 4 years ago

Here's how Tibshirani et al (2011) describe isotonic regression: [Image: isotonic regression description from Tibshirani et al (2011)]

The Pool Adjacent Violators Algorithm (PAVA) solves this; I found this description of PAVA helpful.
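For anyone who wants to experiment, PAVA is short enough to sketch in pure NumPy (a minimal textbook version; the `pava` name and signature are my own):

```python
import numpy as np

def pava(y, w=None):
    """Pool Adjacent Violators: weighted least-squares fit of a
    nondecreasing sequence to y (a minimal textbook sketch)."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    # Each block stores (mean, weight, count); merge adjacent blocks
    # while the monotonicity constraint between them is violated.
    means, weights, counts = [], [], []
    for yi, wi in zip(y, w):
        means.append(yi); weights.append(wi); counts.append(1)
        while len(means) > 1 and means[-2] > means[-1]:
            m2, w2, c2 = means.pop(), weights.pop(), counts.pop()
            m1, w1, c1 = means.pop(), weights.pop(), counts.pop()
            wt = w1 + w2
            means.append((w1 * m1 + w2 * m2) / wt)
            weights.append(wt); counts.append(c1 + c2)
    return np.repeat(means, counts)

fitted = pava([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
# → [1.0, 2.5, 2.5, 3.75, 3.75, 5.0]
```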

rickecon commented 4 years ago

@MaxGhenis. The Tibshirani, et al (2011) paper about isotonic regression is really an interpolant that passes through all the points except where the constraint beta_1 <= beta_2 <= ... <= beta_n binds. You see this because there are n data points {y_n} and n estimated betas {beta_n}. I found some JavaScript code in this Wikipedia article that shows how to make the monotone interpolation that you describe in your first equation, and whose monotone estimation I think is what your optimization problem describes. But I want a regression version of this problem: a functional form where the number of betas is less than the number of data points.

Another way to see my point is to think of the univariate case, like the first one you mention in the previous comment. Each of the betas defines a line between adjacent betas. Two regression versions of this problem are defined in the two alternatives I list at the beginning of this issue (piecewise linear regression and cubic spline regression). There is good code to do this in the univariate case for an interpolation problem. It would take some easy tweaking for the regression in the univariate case. I think it is much more complicated in the bivariate case.
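For the univariate interpolation case, SciPy already ships a monotone (shape-preserving) piecewise-cubic interpolant, `PchipInterpolator`, the same Fritsch-Carlson idea as the Wikipedia JavaScript code. It interpolates rather than regresses, but it shows the machinery (the data values below are made up):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical monotone ETR data points (income scaled arbitrarily).
x = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
tau = np.array([-0.15, -0.05, 0.05, 0.15, 0.25])
f = PchipInterpolator(x, tau)

grid = np.linspace(0, 8, 200)
vals = f(grid)
# PCHIP preserves the monotonicity of the data, so the interpolant
# never overshoots the way an unconstrained cubic spline can.
assert np.all(np.diff(vals) >= 0)
```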

tyleransom commented 4 years ago

What you drew in step 5 looks almost like a set of piecewise log functions. But I guess this function needs to be everywhere twice differentiable, so that wouldn’t work? I’m wondering if there is a way to smooth out the transition points.

tomrod commented 4 years ago

Would some form of wave in the positive orthant suffice, followed by controlled behavior when x or y fall below 0?

MaxGhenis commented 4 years ago

@rickecon could you share sample data to work through?

I only cited the Tibshirani paper for the description of isotonic regression, agree that their approach is unsuitable.

rickecon commented 4 years ago

@tyleransom . I don't think the function has to be everywhere differentiable. It just has to be continuous. The derivatives need not be continuous.

For example, back in the day, Ken Judd taught me how to convexify an optimization problem with N finite segments of a function. Think of a single-period static consumption optimization problem with a step function for marginal tax rates over N income brackets. The problem is convex on each of those income brackets. You convexify the problem by, instead of solving the hard problem of choosing just c, setting up the problem to choose N weights {w_n} and N consumptions {c_n}, one per bracket, to maximize utility. This is analogous to creating lotteries over the income segments. The solution will put weight 1 on the optimal segment and weight 0 on all the others, and the optimal consumption is the c_n corresponding to the weight that equals 1.
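The lottery argument implies the global optimum can be recovered by optimizing within each bracket (where the problem is well-behaved) and keeping the best candidate, i.e., the bracket that would receive weight 1. A minimal numeric sketch with a hypothetical step-function MTR and log utility (all numbers made up):

```python
import numpy as np
from scipy.optimize import minimize_scalar

brackets = [(0.0, 20_000.0, 0.10), (20_000.0, 60_000.0, 0.25),
            (60_000.0, 200_000.0, 0.40)]  # (lo, hi, marginal rate)
wage = 25.0

def after_tax(earnings):
    tax = 0.0
    for lo, hi, mtr in brackets:
        tax += mtr * max(0.0, min(earnings, hi) - lo)
    return earnings - tax

def utility(n):  # log consumption minus a convex disutility of labor
    c = after_tax(wage * n)
    return np.log(c) - 0.5 * n ** 1.5 if c > 0 else -np.inf

# Solve the concave problem bracket by bracket, then take the best.
candidates = []
for lo, hi, _ in brackets:
    res = minimize_scalar(lambda n: -utility(n),
                          bounds=(max(lo / wage, 1e-6), hi / wage),
                          method="bounded")
    candidates.append((res.fun, res.x))
best = min(candidates)  # smallest negated utility = global optimum
```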

So piecewise linear might work here. If continuity is valuable, then piecewise cubic spline regression might be the answer.
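A monotone piecewise-linear regression can be set up directly as a bound-constrained linear least-squares problem: give each segment its own nonnegative slope coefficient, so monotonicity holds by construction. A sketch with synthetic data and an arbitrary knot grid (both are assumptions for illustration):

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200))
y_true = np.tanh(x - 3) + 0.3 * np.tanh(2 * (x - 7))  # two "lumps"
y = y_true + 0.05 * rng.standard_normal(x.size)

knots = np.linspace(0, 10, 9)  # hypothetical segment breakpoints
seg_len = np.diff(knots)
# Design matrix: intercept + clipped-ramp basis; the slope on each
# segment is a separate nonnegative coefficient, so the fitted function
# is monotone by construction (fewer betas than data points).
basis = np.clip(x[:, None] - knots[:-1], 0.0, seg_len)
A = np.column_stack([np.ones_like(x), basis])
lo = np.r_[-np.inf, np.zeros(seg_len.size)]  # intercept free, slopes >= 0
hi = np.full(A.shape[1], np.inf)
res = lsq_linear(A, y, bounds=(lo, hi))
fit = A @ res.x
assert np.all(np.diff(fit) >= -1e-9)  # nondecreasing in (sorted) x
```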

rickecon commented 4 years ago

@tomrod Can you expand on this wave in the positive orthant? What is the functional form? How do I constrain it to be monotonically increasing and have the property that it asymptotes for large income levels x --> infty and y --> infty?

rickecon commented 4 years ago

@MaxGhenis I really like the Tibshirani, et al (2011) paper. I just need to adjust it to the regression version (fitting an average of points) rather than passing through as many points as possible subject to constraints--even though they title their approach regression. This is the approach I am most optimistic about. My only worry is that it will be an order of magnitude harder in two dimensions. But @khakieconomics's suggestion of just convolving univariate functions with weights is a good one. That is essentially what we do with our DEP functional form.
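A minimal sketch of the convolution idea: with a constant weight, a combination of monotone univariate pieces is itself monotone in both arguments and inherits the pieces' negative values and asymptotes. (The univariate forms and weight below are hypothetical; income-share weights as in DEP would require re-checking monotonicity numerically, since the weight then varies with x and y.)

```python
import numpy as np

# Hypothetical univariate pieces: monotone, negative at the bottom,
# finite upper asymptote.
tau_x = lambda x: 0.35 * (1.0 - np.exp(-x / 50_000)) - 0.05
tau_y = lambda y: 0.30 * (1.0 - np.exp(-y / 40_000)) - 0.05
w = 0.7  # constant weight (illustrative)

def tau(x, y):
    return w * tau_x(x) + (1.0 - w) * tau_y(y)

xg = np.linspace(0, 300_000, 60)[:, None]
yg = np.linspace(0, 300_000, 60)[None, :]
T = tau(xg, yg)
assert np.all(np.diff(T, axis=0) > 0)  # monotone in x
assert np.all(np.diff(T, axis=1) > 0)  # monotone in y
assert T[0, 0] < 0                     # negative rate at the bottom
```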

rickecon commented 4 years ago

@MaxGhenis I will post some data later today as a testing ground for these functions. I will take this data from the Tax-Calculator output generated in the get_micro_data.py module of OG-USA. These data will look like the scatterplot in this figure. Time to start experimenting. I will document the progress in this issue. Anyone should feel free to contribute and/or comment.

rickecon commented 4 years ago

Len Burman made an interesting comment about the monotonicity requirement of the tax functions that made me think about this. We decouple the estimation of the marginal tax rate functions from the effective tax rate functions in OG-USA because there are policies that can change effective tax rates without changing marginal tax rates (e.g., certain types of tax credits). The marginal tax rate functions are the only things that show up directly in the Euler equations. As such, I think that means that only our marginal tax rate functions have to have strict monotonicity imposed. And I think we could get away with some nonmonotonicity in the effective tax rate function, which only enters the budget constraint. @jdebacker does this sound correct to you?

tomrod commented 4 years ago

Thinking more about a design in the f(x, y) space that satisfies what you're looking to do. Here is a picture of (x, z) projections of what you're looking to build, as I understand it.

[Image: sketch of (x, z) projections]

jdebacker commented 4 years ago

@rickecon I think the monotonicity is important for ETRs as well because they indirectly enter the FOCs through the budget constraint. I can see issues with having a non-convex budget set that would arise if the ETRs were not monotone.

rickecon commented 4 years ago

@jdebacker . I think you're right. Monotonicity is required for ETRs and MTRs.

MaxGhenis commented 4 years ago

Here's R code for multiple isotonic regression, written by Meyer, an apparent expert on isotonic regression: https://www.stat.colostate.edu/~meyer/multipleiso.htm

Meyer (2013) shows this relevant diagram: [Image: diagram from Meyer (2013)]

Meyer is also a coauthor on Wu et al (2014), "Penalized Isotonic Regression," which is an adaptation we may want to consider. [Image: excerpt from Wu et al (2014)]
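Meyer's code is in R; in Python, a small bivariate isotonic fit can be posed directly as a constrained least-squares problem on a grid (a brute-force sketch, not Meyer's algorithm, with synthetic data):

```python
import numpy as np
from scipy.optimize import minimize

ny, nx = 4, 4
rng = np.random.default_rng(2)
# Noisy observations of a surface that is increasing in both directions.
Y = np.add.outer(np.arange(ny), np.arange(nx)) + rng.normal(0, 0.8, (ny, nx))

def objective(z):
    return np.sum((z - Y.ravel()) ** 2)

# Monotonicity in both grid directions as linear inequalities:
# z[next] - z[current] >= 0 for each adjacent pair.
idx = np.arange(ny * nx).reshape(ny, nx)
rows = []
for i in range(ny):
    for j in range(nx):
        if i + 1 < ny:
            r = np.zeros(ny * nx); r[idx[i + 1, j]] = 1; r[idx[i, j]] = -1
            rows.append(r)
        if j + 1 < nx:
            r = np.zeros(ny * nx); r[idx[i, j + 1]] = 1; r[idx[i, j]] = -1
            rows.append(r)
A = np.array(rows)
con = {"type": "ineq", "fun": lambda z: A @ z}
res = minimize(objective, Y.ravel(), constraints=[con], method="SLSQP")
Z = res.x.reshape(ny, nx)
# The fitted surface is nondecreasing along rows and columns.
assert np.all(np.diff(Z, axis=0) >= -1e-6)
assert np.all(np.diff(Z, axis=1) >= -1e-6)
```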

jdebacker commented 4 years ago

In addition to the choice of functional form, others, such as @jpycroft and @benrpage1, have noted the following considerations in the estimation:

  1. The statistical objective function and distance norm used (e.g., mean squared error or mean absolute deviation).
  2. The optimizer used to estimate the parameters (e.g., a global optimization algorithm vs. gradient-based methods).
  3. Identifying where the function's fit is better or worse (e.g., are errors larger for high-income filers or low-income filers?). One could consider changing the weights applied to errors to target a better fit on a particular part of the function.
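These three considerations can all be toggled in a small estimation harness. Everything below (the model form, data, norms, and weights) is synthetic and purely illustrative, not the OG-USA estimation code:

```python
import numpy as np
from scipy.optimize import minimize, differential_evolution

rng = np.random.default_rng(1)
income = rng.uniform(1e3, 5e5, 400)
etr_obs = (0.35 * (1 - np.exp(-income / 8e4)) - 0.05
           + 0.02 * rng.standard_normal(income.size))  # synthetic data

def model(theta, inc):
    b, scale, shift = theta  # hypothetical 3-parameter form
    return b * (1 - np.exp(-inc / scale)) + shift

def loss(theta, norm="l2", weights=None):
    err = etr_obs - model(theta, income)
    w = np.ones_like(err) if weights is None else weights
    return np.mean(w * err ** 2) if norm == "l2" else np.mean(w * np.abs(err))

# 1. distance norm: L2 vs L1 (pass norm="l1")
# 2. optimizer: gradient-free local search vs a global algorithm
local = minimize(loss, x0=[0.3, 5e4, 0.0], method="Nelder-Mead")
global_ = differential_evolution(loss, args=("l1",), seed=0, maxiter=100,
                                 bounds=[(0, 1), (1e3, 5e5), (-0.3, 0.3)])
# 3. reweighting: upweight low-income observations to fit that region better
w_low = np.where(income < 5e4, 5.0, 1.0)
weighted = minimize(lambda t: loss(t, weights=w_low), x0=local.x,
                    method="Nelder-Mead")
```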