Variables with recursive definitions

jessegrabowski commented 2 years ago

Hello,

Thank you for releasing this wonderful tool. I've been experimenting with trying to express different models within the sequence_jacobian framework, and came across a case I'm not quite sure how to deal with. The model comes from Schmitt-Grohé and Uribe 2006, "Optimal Simple and Implementable Monetary and Fiscal Rules". In it, the authors use a Calvo rule and Dixit-Stiglitz aggregator to obtain the usual phillips curve, but rather than proceeding directly to linearization, they introduce three auxiliary variables which allow them to recursively define the Phillips curve and maintain the non-linear structure. It is not obvious to me how such a recursive definition should be implemented in sequence_jacobian. A naive implementation might look like this:

@simple
def final_goods_firm(Y, RHS, LHS, pi_star, pi, mc, beta, psi, theta, λ):
    LHS = λ * Y * mc +\
        beta * theta * pi(+1) ** ((1 + psi) / psi) * LHS(+1)
    RHS = λ * Y * pi_star +\
        beta * theta * pi(+1) ** (1 / psi) * pi_star / pi_star(+1) * RHS(+1)

    phillips_curve = (1 + psi) * LHS - RHS

    return phillips_curve, LHS, RHS

@simple
def firm_aggregation(Y_j, pi_star, pi, nu_p, psi, theta):
    nu_p = (1 - theta) * (1 / pi_star) ** ((1 + psi) / psi) +\
        theta * pi ** ((1 + psi) / psi) * nu_p(-1)

    Y = Y_j / nu_p

    return Y, nu_p

Obviously this doesn't work -- it induces a cycle in the DAG, because the same variable is included among the inputs and the outputs.

Surprisingly, if i just replace all the leads and lags with auxiliary variables, this sort of works -- the winding condition number is 0, solve_jacobian runs, and I can produce IRFs with the correct dynamics. It's definitely not implemented correctly, though: I needed to manually provide the steady state (the built-in solver threw a "no complex numbers allowed" error), and non-linear solver diverges to infinity after a few steps.

My intuition is that computing these recursive variables should be possible within the framework, but I am missing something. They seem similar to what is being computed in a het block, except the expectations are to be taken over the exogenous shocks rather than the distribution of agents. It seems like it should be possible to use the jacobians of input variables with respect to exogenous shocks and, starting from LHS = LHS(+1) = LHS(ss), work backward for a truncated sequence T, much like the het block backward pass. But it's equally possible I don't understand what I'm talking about.

I'm going through the code base to try to understand how everything works and how it might apply to this case, but I would appreciate some pointers to get going in the right direction. Perhaps this has a simple solution I'm just too dense to see?

bbardoczy commented 2 years ago

Thanks for posting the issue here so that others can learn from it, too! I'll look into it when I can. My best guess is next week. Cheers!

jessegrabowski commented 2 years ago

Thanks for replying, let me know if you have any hints on how to approach the problem in the meantime. I am working my way though translating the old version example notebooks to the new API and trying to understand how things work under the hood. Hopefully I can contribute meaningfully to solving my own problem.

bbardoczy commented 2 years ago

Translating the notebooks is something I can help you right away. I just pushed a few new commits to the master branch. RBC, Krusell-Smith, and one-asset HANK are translated already.

jessegrabowski commented 2 years ago

Thanks -- the updated notebooks are great, and in fact have an example of recursive variables using solved blocks. In two_asset, for example, price is defined recursively as follows:

@solved(unknowns={'p': (5, 15)}, targets=['equity'], solver="brentq")
def arbitrage_solved(div, p, r):
    equity = div(+1) + p(+1) - p * (1 + r(+1))
    return equity

So this basically answers my question!

But I still think there is a more elegant solution, and one which perhaps doesn't require a numerical solver. It is possible to form a Jacobian for the recursive relationship p(+1) = p* (1 + r(+1)) - div(+1), and the structure of such a Jacobian is very regular, with entry (t,s) equal to a product of all the single-step derivatives from t to s (leads, upper triangle) or s to t (lags, lower triangle).

To implement a "recursive block", I think all that is needed is a wrapper to declare which variable has a recursive relationship, so that internally something clever can be done to let the DAG know this variable in the input and the output is being handled in a special way, plus some internal naming things to make sure you can have the same variable name in both the input and the output, e.g. to access recursive_block.jacobian(...)['p']['p'].

I'll play around with it in my free time and let you know if I come up with anything. Thanks again for the updated notebooks.

bbardoczy commented 2 years ago

Hi Jesse, Let me make sure we're on the same page.

Simple block paradigm.

The set of inputs and set of outputs must be disjoint.
You can define a new variable within the block and then use it immediately in a definition of a new variable within the same block. Like nu_p in your firm_aggregation block. However, you can only use it's contemporaneous value. So, Y = Y_j / nu_p is valid, but Y_j / nu_p(-1) is not. Of course, you may use leads/lags of nu_p in simple blocks down the DAG.

Handling recursive equilibrium conditions

Even though simple blocks don't support the kind of recursive definition that you attempted, they are sufficient to represent any DSGE model with no or trivial heterogeneity. You just have to include all such variables as unknowns, and use the corresponding dynamic equations as targets (i.e., express their residual, which must be 0 in equilibrium).
Having said that, it's preferable to keep the number of unknowns small (a large $H_U$ slows things down). As you correctly pointed out, there's enough information "in the block" to solve for these recursive variables locally. This is where solved blocks come into the picture.

Solved block paradigm

One application is to resolve recursive definitions locally. More generally, you can use them to resolve any implicit relationship.
Our convention---set of inputs and set of outputs must be disjoint---applies in this case as well. Our implementation of key methods depends on this (topological sort, forward accumulation of Jacobians etc.).

Picture solved blocks as a wrapper around an inner block or blocks. Take this example:

@solved(unknowns={'p': (5, 15)}, targets=['equity'], solver="brentq")
def arbitrage_solved(div, p, r):
equity = div(+1) + p(+1) - p * (1 + r(+1))
return equity

The inner block is a

@simple
def arbitrage_solved(div, p, r):
equity = div(+1) + p(+1) - p * (1 + r(+1))
return equity

In the inner block, div is an input, but not an output. In the outer block, div is an output, but not an input.

Computing the Jacobian of solved blocks is pretty efficient, and completely analogous to how we deal with the model Jacobian. Let $H(div; p, r)=0$ be the algebraic representation of the inner block. Then, $J^{div, r} = -H_{div}^{-1} H_r.$
Evaluating solved blocks relies on numerical solvers. These typically take just a few of iterations to converge. The recursive block you envisioned could very well offer some improvement here. However, evaluating the inner simple blocks is very quick (we do exploit sparsity there), so this is not something that we were planning to do.

Hope this helps!

Best, Bence

jessegrabowski commented 2 years ago

Bence,

Indeed, it does help! I understand better how to implement these sorts of blocks into the framework. I'll mark this issue as closed.

One thing that excites me about this framework for solving rational expectation models generally, not just heterogeneous agent models, is that it is end-to-end differentiable. The QZ decomposition, on the other hand, is not. It would not take much to implement this in any kind of computational graph framework, which would in turn open doors during estimation; specifically the use of HMC samplers for Bayesian estimation. It seems to be the numerical solvers in the solved block are the only place where this end-to-end differentiability isn't trivially true.

shade-econ / sequence-jacobian

Variables with recursive definitions #5