econ-ark / HARK

Heterogeneous Agents Resources & toolKit
Apache License 2.0

Endogenous shocks #996

Closed sbenthall closed 3 years ago

sbenthall commented 3 years ago

Continuing from here

Further, I do not see any technical reason to structure things in such a way that there can only be one draw of shocks per period.

As far as I know, this suggestion amounts to a new feature request, as there's nothing in HARK that supports this kind of model yet.

  1. This is completely possible in current HARK. You had some prior exchanges with Matt about the fact that the current structure allows arbitrary variation in the "solve_one_period" method from one period to the next. That means that any aspect of the problem -- including shocks -- can be different. The user just has to hand craft the solve_one_period sequence accordingly.

  2. We are talking about the structure of a new architecture, so it's not really right to think about it as a "new feature request." It's about designing the new architecture in a flexible way that permits users to do what they want.

There are a number of different ideas here. Trying to tease them out:

In all other models, (and currently all the models in Dolo, I believe), all shocks are exogenous shocks, meaning that they do not depend on any agent's control decision. I believe Dolo exploits this technically by computing the exogenous shocks separately from the agent's behavior. You seem to be describing an endogenous shock, where a random variable depends on an agent's decision.

Well, the shock itself can be conceived of as occurring whether or not the agent exposes themselves to it. It's like "do I invest in stocks?" The point is that if I do NOT have any stock investments, then I don't have to do the expensive numerical integration across all the possible realizations of the rate of return on stocks. The stock market still exists, and has its exogenous shock, it just doesn't affect me.
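A hedged sketch of the computational point (all names here are illustrative, not existing HARK API): when the agent holds no stocks, the expectation over return realizations collapses to a single evaluation, so the numerical integration over the return shock can be skipped entirely, even though the shock itself still exists exogenously.

```python
import numpy as np

def expected_value_next(m, stock_share, v_next, return_draws, return_probs, r_safe=1.02):
    """Expected next-period value over discretized equity-return realizations."""
    if stock_share == 0.0:
        # No equity exposure: the return shock still exists exogenously,
        # but it cannot affect this agent, so no integration is needed.
        return v_next(r_safe * m)
    # Otherwise integrate numerically over the discretized return shock.
    portfolio_return = (1 - stock_share) * r_safe + stock_share * return_draws
    return np.dot(return_probs, v_next(portfolio_return * m))
```

With `stock_share == 0` the cost is one function evaluation rather than one per quadrature node, which is the saving being described.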

I see two separate questions here which it is tempting to conflate into one:

  • What is the right way to model this kind of situation?
  • What is the most efficient way to implement the simulation and solution code for this kind of situation?

In my view, it's important to answer the first question before the second because "premature optimization is the root of all evil".

Agreed. The only potential argument I know of for restricting the shocks to occur only at one stage is that it might be faster -- that counts as "premature optimization" and could result in considerable evil.

Addressing the first question (how to model this?), I would propose:

  • Consider stochastic draws to be a subtype of transition equation
  • Model an endogenous shock as a sample from a conditional probability distribution, where the preceding model state and control variables are conditions on the right-hand side of the equation.

Agreed.
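The proposal could be sketched roughly as follows (a hypothetical illustration, not existing HARK API; the variable names and the dependence of the shock's scale on a control are assumptions for the example):

```python
import numpy as np

def wage_shock_distribution(state, controls):
    """Return the shock's distribution conditional on prior state and controls.

    Illustrative: the shock's scale depends on an (assumed) control,
    making it an endogenous shock in the sense discussed above.
    """
    sigma = 0.1 + 0.2 * controls["risky_labor_share"]
    return {"loc": 0.0, "scale": sigma}

def draw_shock(dist, rng):
    """The stochastic 'transition equation' step: sample one realization."""
    return rng.normal(dist["loc"], dist["scale"])

# Usage: a shock draw is just one more transition applied in sequence.
rng = np.random.default_rng(0)
state = {"wealth": 10.0}
controls = {"risky_labor_share": 0.5}
shock = draw_shock(wage_shock_distribution(state, controls), rng)
```

Treating the draw as a transition equation means it can sit anywhere in the period's sequence of equations, conditioned on whatever came before it.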

mnwhite commented 3 years ago

TLDR: You actually do want to precompute the shocks for all possible choices that all agents could make, and it's not that computationally burdensome.

In an estimation setting, shocks should always be precomputed and "locked in" to avoid oddities in the objective function from variation in the RNG. Suppose we're trying to estimate some model parameters by minimizing objective function f on parameters theta (many parameters, let's say). To have any hope of minimizing f numerically, we want it to be continuous and smooth, so that our optimization method can identify "which way is down". Jags and bumps are bad, but they're inevitable in any model with discrete outcomes that are endogenous.

To see this, consider a tiny epsilon change in one parameter that has very tiny (marginal) effects on the policy functions, but this small difference is enough to cause one simulated agent (out of 100,000, say) to discretely change their behavior in period t=3 (say, from going to school to instead entering the labor market). This agent's entire history after t=3 will now be discretely different, as they'll have one less period of schooling and one more period of labor experience. This is guaranteed to produce a small discontinuity in the objective function... and there are many of those.

That discontinuity will be small, but consider what would happen if instead of changing only the behavior of that agent at this discontinuity, we instead changed the behavior of all of the agents. This would happen if you didn't pre-specify all the possible shocks that each agent could get in each period. Suppose there was a set of "grade shocks" and a set of "wage shocks"; agents in school get grade shocks, agents in the labor market get wage shocks. If the code were set up so that it checked how many agents are in school and drew grade shocks for them, then checked how many agents are in the labor market and drew wage shocks for them, then one agent switching from school to labor would change these numbers, and the shocks assigned to all other agents would be offset by one.

It saves a lot of difficulty and complications later (and makes the objective function much smoother) if all possible shocks are pre-drawn at the very beginning, and then we grab the ones that are actually necessary as simulation happens.
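A minimal sketch of this pre-drawing scheme (illustrative names, not HARK API): every shock each agent could possibly receive is drawn up front, indexed by agent and period, so one agent's discrete choice never shifts the draws assigned to anyone else.

```python
import numpy as np

n_agents, n_periods = 100_000, 10
rng = np.random.default_rng(42)

# Draw ALL possible shocks up front, indexed by (agent, period):
grade_shocks = rng.normal(0.0, 1.0, size=(n_agents, n_periods))
wage_shocks = rng.normal(0.0, 0.3, size=(n_agents, n_periods))

def simulate_period(t, in_school):
    """Grab each agent's pre-drawn shock for whichever state they are in."""
    return np.where(in_school, grade_shocks[:, t], wage_shocks[:, t])

# If agent 0 flips from school to work, only agent 0's realized shock
# changes; every other agent's shock is identical across the two runs.
base = simulate_period(3, in_school=np.ones(n_agents, dtype=bool))
flipped = np.ones(n_agents, dtype=bool)
flipped[0] = False
alt = simulate_period(3, in_school=flipped)
```

Had the shocks instead been drawn sequentially from a shared RNG stream based on who is in which state, the flip would have offset every subsequent draw, producing the objective-function jaggedness described above.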

On Tue, Apr 6, 2021 at 10:25 AM Sebastian Benthall @.***> wrote:

Continuing from here https://github.com/econ-ark/HARK/issues/620#issuecomment-813603497

Further, I do not see any technical reason to structure things in such a way that there can only be one draw of shocks per period. Maybe a good technical case for this exists and I'm missing it, but if, for example, one of the decisions that a person makes in the period is whether or not to take some risk, or which of several kinds of risks to take, it would be wasteful to precompute shock distributions for all possible choices if the person might not actually make them.

As far as I know, this suggestion amounts to a new feature request, as there's nothing in HARK that supports this kind of model yet.

There are a number of different ideas here. Trying to tease them out:

In all other models, (and currently all the models in Dolo, I believe), all shocks are exogenous shocks, meaning that they do not depend on any agent's control decision. I believe Dolo exploits this technically by computing the exogenous shocks separately from the agent's behavior.

You seem to be describing an endogenous shock, where a random variable depends on an agent's decision.

I see two separate questions here which it is tempting to conflate into one:

  • What is the right way to model this kind of situation?
  • What is the most efficient way to implement the simulation and solution code for this kind of situation?

In my view, it's important to answer the first question before the second because "premature optimization is the root of all evil".

Addressing the first question (how to model this?), I would propose:

  • Consider stochastic draws to be a subtype of transition equation
  • Model an endogenous shock as a sample from a conditional probability distribution, where the preceding model state and control variables are conditions on the right-hand side of the equation.

It may be that the most closely analogous existing model in HARK is the ConsMarkovModel. See issue #956 https://github.com/econ-ark/HARK/issues/956 for a proposal about encapsulating the dependency of income shocks on a Markov process with a conditional probability distribution.


sbenthall commented 3 years ago

@llorracc I see that you've responded to my issue by editing my original post with your responses inline. That's quite odd. Now it looks like I'm arguing with myself.

  1. arbitrary variation in the "solve_one_period" method from one period to the next. That means that any aspect of the problem -- including shocks -- can be different. The user just has to hand craft the solve_one_period sequence accordingly.

It is certainly always possible to hand craft something that does exactly what you want it to do.

But if you need to hand craft something to make it happen, it's not quite accurate to say that the current HARK code supports the functionality.

2. We are talking about the structure of a new architecture, so it's not really right to think about it as a "new feature request." It's about designing the new architecture in a flexible way that permits users to do what they want.

I don't know how to address this.

I'm talking about new features to an existing software library, or ways to refactor it.

Are you suggesting a complete, new rewrite of HARK? Like a HARK 2.0? I would be happy to assign this ticket to the HARK 2-onward milestone, though I think there are interesting ways to do endogenous shocks with less ambitious changes.

The stock market still exists, and has its exogenous shock, it just doesn't affect me.

If the shock is truly exogenous, then for the reasons Matt's mentioned among others, I don't think it hurts much to sample it in forward simulation.

I understand that you are mainly interested in how to write solvers though.

I guess I would need to see a specific problem to understand whether and how the numerical integration costs could be reduced.

The only potential argument I know of for restricting the shocks to occur only at one stage is that it might be faster -- that counts as "premature optimization" and could result in considerable evil.

I don't really understand what you're arguing for, sorry.

If the shocks are exogenous, then I suppose they could show up at any stage, as long as they occur prior to a control that depends on them.

But are you saying that the same shock should be drawn potentially multiple times during the same period?

I think that way of abstracting out the problem creates more problems than it solves. Alternatively, one could:

  • introduce more shock variables, sampled once.
  • have the random distribution of the shock variable create all the information needed by the multiple draws.

I felt like these tables laying out model variables and their relationships via equations were great for establishing common understanding of terminology:

https://github.com/econ-ark/HARK/issues/991#issuecomment-812644798

But what you've been suggesting since seems to be rejecting that attempt to systematize the problem. Why is that?

llorracc commented 3 years ago

On Tue, Apr 6, 2021 at 3:55 PM Sebastian Benthall @.***> wrote:

@llorracc https://github.com/llorracc I see that you've responded to my issue by editing my original post with your responses inline. That's quite odd. Now it looks like I'm arguing with myself.

Oops, I was trying to copy and paste your text and respond as in email.

  1. arbitrary variation in the "solve_one_period" method from one period to the next. That means that any aspect of the problem -- including shocks -- can be different. The user just has to hand craft the solve_one_period sequence accordingly.

It is certainly always possible to hand craft something that does exactly what you want it to do.

But if you need to hand craft something to make it happen, it's not quite accurate to say that the current HARK code supports the functionality.

That's not really right. The whole scheme of HARK is to say "there's really only ONE thing that needs to be handcrafted for every model: the solve_one_period method. The rest of HARK is the infrastructure that takes care of all the rest of the housekeeping of Bellman problems and lets you concentrate all of your attention as a modeler on the core of your mathematical problem, which is handcrafting your solve_one_period setup."

  1. We are talking about the structure of a new architecture, so it's not really right to think about it as a "new feature request." It's about designing the new architecture in a flexible way that permits users to do what they want.

I don't know how to address this.

I'm talking about new features to an existing software library, or ways to refactor it.

Are you suggesting a complete, new rewrite of HARK? Like a HARK 2.0? I would be happy to assign this ticket to the HARK 2-onward milestone, though I think there are interesting ways to do endogenous shocks with less ambitious changes.

The stock market still exists, and has its exogenous shock, it just doesn't affect me.

If the shock is truly exogenous, then for the reasons Matt's mentioned among others, I don't think it hurts much to sample it in forward simulation.

I understand that you are mainly interested in how to write solvers though.

Right. I was mostly talking about solvers.

I guess I would need to see a specific problem to understand whether and how the numerical integration costs could be reduced.

The only potential argument I know of for restricting the shocks to occur only at one stage is that it might be faster -- that counts as "premature optimization" and could result in considerable evil.

I don't really understand what you're arguing for, sorry.

If the shocks are exogenous, then I suppose they could show up at any stage, as long as they occur prior to a control that depends on them.

Right.

But are you saying that the same shock should be drawn potentially multiple times during the same period?

No.

I think that way of abstracting out the problem creates more problems than it solves. Alternatively, one could:

  • introduce more shock variables, sampled once.
  • have the random distribution of the shock variable create all the information needed by the multiple draws.

I felt like these tables laying out model variables and their relationships via equations were great for establishing common understanding of terminology:


https://github.com/econ-ark/HARK/issues/991#issuecomment-812644798

But what you've been suggesting since seems to be rejecting that attempt to systematize the problem. Why is that?

I'm happy with that table as a description of that particular problem, which happens to be one in which the shocks are all (in sequence) at the beginning. But nothing about the table specifies that we ought to impose a restriction that shocks must always happen at the beginning. For the forward simulation part of things there may be reasons for executing the simulation that way, but I can't see a reason to impose it as a requirement. Indeed, it seems to me that it would complicate our lives considerably to impose it.

Somehow I feel that there is some way in which we are talking past each other. Maybe if you could articulate a reason why you think it would be useful to require problems to be structured in such a way that all shocks must be realized before any other steps, that would help me pinpoint what our communication difficulty might be.



sbenthall commented 3 years ago

Somehow I feel that there is some way in which we are talking past each other. Maybe if you could articulate a reason why you think it would be useful to require problems to be structured in such a way that all shocks must be realized before any other steps, that would help me pinpoint what our communication difficulty might be.

Ah, indeed we are talking past each other. I don't think it is useful to require problems to be structured in such a way that all shocks must be realized before other steps. I apologize for the confusion.

I think a point we agree on is that: for each shock variable, it will be drawn at most once per period, somewhere in the period.

I see now that we have been "in violent agreement", each misinterpreting the other as being opposed to this point of agreement.

I'll close this ticket now.

That's not really right. The whole scheme of HARK is to say "there's really only ONE thing that needs to be handcrafted for every model: the solve_one_period method. The rest of HARK is the infrastructure that takes care of all the rest of the housekeeping of Bellman problems and lets you concentrate all of your attention as a modeler on the core of your mathematical problem, which is handcrafting your solve_one_period setup."

This discussion is a bit philosophical now, and out of scope for the ticket....

But where I think we agree is that it would be even better if HARK supported model building via configuration, the way Dolo does: letting the modeler define a solution in terms of known algorithms with a clearly defined Bellman-form equation, or something carrying comparable information.