Thoughts on covid and SciML lecture outline

jlperla commented 4 years ago

@jstac @thomassargent30 @ChrisRackauckas

Starting with the python version as a reference: https://python.quantecon.org/sir_model.html

My thought is that there could be two lectures here:

Modeling Covid 19/Introduction to SDEs
Policy Uncertainty with Covid 19/Introduction to Scientific Machine Learning

Here are some rough thoughts, none of which I am especially attached to. But I think a good goal is that everything in the "first lecture" are things where the code is probably already complete somewhere so it could be done quickly.

For the first lecture:

We could start by porting a good portion of https://python.quantecon.org/sir_model.html to the OrdinaryDiffEq package, up to the Ending the Lockdown.
We then could add a section introducing the basic structure of a SDE that are initial value problems (i.e. forward equations, no control problems in these lectures)
As an application of that, we add in a reasonable additive aggregate shock to the SIR model. There are a few candidates of the top of my head:
- We could make the i have an additive shock representing people migrating into the region?
- We turn it into an ODE with 4 states... S, I, R, sigma. Where the sigma was the infection rate. Then we make the sigma stochastic.
- Similarly, we could make the beta (i.e. the transmission rate) stochastic, since I think there are aggregate and unpredictable changes to how much people "follow the rules".
- Chris, maybe you have some ideas from reasonable things people have done before and where there is code?
Given the SDE, we show simulating a path of it, as well as ensembles using the diffeq tools. We can focus on the typical quantiles (e.g. mean, 5/95% etc.).
- For example, I think it would be very useful to look at the aggregate deaths to date at some point T in the future? I think in the simple setup there, it is a constant times the R. That parameter could be useful for the second lecture since nobody really knows it.
Finally, I think it is worthwhile for economists to learn the connection between the SDEs for the Langevin equations typically used in the approximation of chemical reactions/etc.
- For that, a discrete number of agents bumping into each other can be approximated by an SDE.
- There, the noise in the SDE captures the discreteness, which could be useful for all sorts of models as well as modeling covid in small groups or isolated regions.

For the second lecture:

There are a lot of opportunities here to introduce tricks from scientific machine learning as well as modeling uncertainty.
We could take the "Ending the Lockdown" example from the python ones as a starting point.
What if we don't really know the a particular parameter (e.g. the beta or sigma). Then what do the scenarios look like for ending the lockdown with a particular policy? We can show ensembles and quantiles for that.
- I think this could be the uncertainty quantification?
Alternatively, lets say that we know the parameters then when should is the earlyist you should end the lockdown if you want to ethe number of deaths will be below XXX.
- This can be done with diffeqflux and auto-differentiating the deterministic ODE?
But what something is stochastic (e.g. the sigma or beta). Then when should is the earliest you should end the lockdown if you want to ensure there is a 95% chance the number of deaths will be below XXX.
- This should be possible with differentiating the SDE solution... though I don't know if that stuff is ready Chris?
Finally, lets come up with a policy tension. Aggregate deaths vs. the unemployment rate.
- We would need some sort of simple unemployment rate as a function of lockdown length.
- Then define a simple loss function of those two at time T. Maybe quadratic.
- Given the loss function of those two at time T, what is the optimal policy for lockdown length? This seems like a DiffeqFlux candidate.
Finally, lets put it together...
- Assume that the loss function is a little more complicated. If the number of infected ever goes above XXX then the medical system is overwhelmed and the cost becomes dramatically higher.
- Now, lets say you have either policy uncertainty in a parameter or you there are aggregate shocks.
- What should your policy of ending the lockdown be in those circumstances? Things are stochastic and the loss function is not a simple quadratic.

thomassargent30 commented 4 years ago

Jesse --

I like what you propose both in terms of teaching about tools and applying them to a nifty class of models. If some simple "uncertainty quantification" could be done in the spirit of "uncertainty outside (but NOT inside) models" it could be a nice introduction to those techniques as well as a way of setting the stage for some models with uncertainty inside the models including from agents inside the models.

ChrisRackauckas commented 4 years ago

Hey,

A good resource to start from as well can be the updated epirecipes https://github.com/epirecipes/sir-julia and the older version http://epirecip.es/epicookbook/chapters/sir/julia . The those show all of the models, but indeed it would be nice to use this space as a larger explanation of them. This tutorial for example (https://github.com/epirecipes/sir-julia/blob/master/markdown/rn_mtk/rn_mtk.md) describes transforming a jump equation (Levy process) into SDE and ODE approximations, but doesn't explain what the equations it's transforming to are like and why those are the equations. So maybe taking that as a starting point and expanding from there might be a good start. And https://docs.sciml.ai/latest/tutorials/discrete_stochastic_example/ is a good reference in the DiffEq documentation on how to define numerical problems related to these equation types and mixtures.

From Markov Models to ODEs and SDEs: the Wide Range of Epidemic Model Approximations

With those in mind, I think maybe even reversing the normal flow could be an interesting way to introduce these models in a deep way:

Start of by fully building up the model as a Poisson processes over time (https://github.com/epirecipes/sir-julia/blob/master/markdown/abm_vector/abm_vector.md or https://github.com/epirecipes/sir-julia/blob/master/markdown/abm_vector/abm_vector_diffeq.md). This one can be explained directly from a few probabilistic assumptions. If the probability of an event happening in a given time interval is constant, then you have Poisson distributions. If you don't have too many events happening, it's approximately Poisson. So you have S+I->2I with some rate, I->R at some rate, take a bunch of Poisson variables in a loop, and that gives you a fully discrete evolving system.
Next we can describe how an approximation to this system is instead of going Poisson with lambda, we can instead directly describe these equations as evolving with those same rates at some dt. This is the mass action laws. This loses the discrete nature of the model and gives you an ODE, but as the number of individuals in the population -> infinity, this becomes an increasingly better approximation where we're talking about "percentages of individuals" as the continuous variables.
Is there an intermediate approximation? A quick calculation of the mean and variances lets you replace the Poisson distributions with normal distributions (central limit, these converge as rate->infinity which is the same as the number of individuals -> infinity). This normal distribution approximation is the SDE via the chemical Langevin equations.
Finally we can take a step back to the discrete process and realize that the Poisson was an approximation too: it was only exactly correct if you only had one jump happening per interval, since if a jump happens the rates change! So then we describe how if you have N Poisson random variables, then the time to the next event is exponential with the waiting time being the sum of the rates. This give's Gillespie's SSA algorithm for exactly handling each jump.

This form would require a bit of probability, but wouldn't require anything crazy like measure theory to explain, so it might serve as a very cool non-rigorous but still mathematically-informed version for understanding how these discrete Markov models relate to the continuous approximations. Also, the fact that the continuous models are inherently worse approximations for small approximations, like when you have only 1 infected person, becomes extremely clear from this presentation, since by design all of the show that the continuous model is a population->infinity limit. The tooling can then be weaved into this story as how you solve the models, but placing the models first probably would be more helpful to people just entering the domain, where the tools are mostly a "okay now solve it" button.

Inverse Problems, Control, and Scientific Machine Learning

I'd clump this all together as the same general parameter estimation concept.

Start by doing the scenarios you describe. "Ending the Lockdown" is great. They key is to demonstrate that the way to figure out how the model is changing under different scenarios is to change parameters, look at the solution, and evaluate how that change effected the solution.
That is then a nice lead-in to control, which uses exactly this same process but more automatic. Start by looking at an ODE model with a desired control, like you say keeping the deaths below XXX. How do you make that happen? Define a loss function where you solve the ODE and look at the difference of the solution to what you want to see. Perform an optimization on the parameters.
We can then demonstrate that what we did was one example of parameter estimation. The classic problem of fitting data is just another choice of a loss function. Then another is your example on fitting parameters of SDEs so that you get a 95% chance the number of deaths is below XX% (the AD tools are ready here). So more generally, data-informed and control-informed versions of the model are just parameter optimizations over loss functions. Many examples here as you describe, and we can go through them all showing that it's all the same basic underlying mathematical concept applied in new ways.
If we want to go the next step towards uncertainty quantification, we can describe how these optimization problems can be converted into a Bayesian formulation where we're searching for random variables for the parameters, have priors on them. The equivalent optimization problem is then given by the MAP estimate, but we can more generally estimate the models via a probabilistic programming language like Turing. We have some demonstrations showing how to do it already (https://github.com/ChrisRackauckas/TuringTutorials/blob/319dd4c37f831c65f2b864167adc98b91b585cec/10_BayesianDiffEq.ipynb), but it would be good to take these optimization problems that we just built up and then get distributional approximations of the solution instead.

We could go into scientific machine learning and estimating incorrect model forms by looking at the problem one of the students from my SciML course did (https://covid19ml.org/), which was that we don't actually know how exposure works, it's very nonlinear, and the linear assumption of making it a rate constant is quite bad, so let's make that part of the model be an unknown function of quarantine and such, and try to fit a neural network to understand how different measures of quarentine give rise to a more complex model. This starts requiring per-county data for the estimation though, which might be muddling the story and so it might be (sadly) better to cut before this.

jstac commented 4 years ago

Hi all, I don't have much to add but I'm very interested. I like the exposition that @ChrisRackauckas suggests -- continuous time MC to ODE approximation. That sounds like a good starting point, and a form of exposition where simulation would be helpful.

I looked this stuff up before getting involved with that first SIR lecture and found it fun and interesting. If someone else kicks off the writing process along these lines I'll try to contribute.

jlperla commented 4 years ago

I like it. My only concern is that since economists are so used to deterministic ODEs using large number approximations that building it up from the finite-agent approximations scratch might be too much in the same lecture as looking at aggregate stochastic shocks.

Perhaps having splitting out a shorter continuous-time markov chain lecture makes sense? We could take out https://julia.quantecon.org/tools_and_techniques/numerical_linear_algebra.html#Continuous-Time-Markov-Chains-%28CTMC%29 clean it up, and then go through the connection of markov chains and the look at the gillespie and langevin elements. I have a feel that there may be useful tools for dealing with search models where the number of agents is large enough for a continuum approximation, but not too small for a deterministic ODE to work.

I am trying to see if I can improve my childcare situation by the beginning of June so hopefully I can think more soon.

jstac commented 4 years ago

+1 on splitting out a shorter continuous time MC lecture. Topics to cover

matrix exponentials
Q matrices and their relationship to Markov matrices (via matrix exponential, Kolmogorov fwd and backward equations)
exponential clock interpretation of Q matrices
ergodicity
applications (inventory dynamics?)

jlperla commented 4 years ago

Hi all. It looks like I will have some childcare help in June, so maybe we can sketch some stuff out late next week? I think a lot of the markov chains+covid sde stuff could come together quickly. Then the uncertainty quantification/sciml material can be built on at the appropriate speed, since much of it would be new.

jlperla commented 4 years ago

@jstac @thomassargent30 Sorry the the delay. @ChrisRackauckas and I are going to power through the basic covid lecture at the beginning of next week.

I think our intention was to:

Start by porting over a chunk of the https://python.quantecon.org/sir_model.html code to julia for the deterministic model,
Add in maybe one more deterministic bell and whitle. Maybe a SIER model?
Solve ensemblers with aggregate stochastic shocks (i.e. not the randomness coming from the discreteness of the problem, as in the gillespie solutions, which we could handle in a sepraate continuous time lecture).

For this, I think a few of my questions are:

Do you like the idea of adding in a SIER model?
Do we want to cut out any of the current sections in the https://python.quantecon.org/sir_model.html to simplify this and leave more room for other examples and prevent duplication when we look at aggregate shocks?
- For example, only do https://python.quantecon.org/sir_model.html#Ending-Lockdown with the expanded SDE which nests the deterministic SIR ODE.
Thoughts on what interesting aggregate stochastic shocks would be that are believable and economically interesting?
- One would be that the transmission rate itself is stochastic? Then the ODE has a beta(t) in it, as it probably should given teh applications, and it has an SDE term?
- Another is that the proportion of infected people has a random element so that the SDE for the s(t) has a diffusive term. That could represent agggregate flows coming into and out of the geography

jstac commented 4 years ago

Please feel free to cut out the sections you mentioned. Or the lecture could be expanded into two.

Aggregate shocks to the transmission rate would make for interesting dynamics but it's not so obvious what would drive them. Events like the recent street protests?

jstac commented 4 years ago

Here's one potentially interesting idea: "Flattening the curve" is producing more like damped oscillations in some places (Australia included). New cases tend to zero and then spike up again.

What's driving this? I guess it's a learning process. The transmission rate is determined by govt policy and social behavior. Both base their choices on some kind of reinforcement learning mechanism. If new cases are low, behavior relaxes. Case then increase and behavior tightens up again.

Learning could be modeled in a simple way (backward looking weighted average) or a sophisticated way (machine learning).

jlperla commented 4 years ago

A thought on the superset of equations for this, which might nest the examples we are interested in.

Here I have put left things unnormalized, which will be useful if we have aggregate shocks to people entering/exiting the particular geography and counting the cumulative deaths for policy uncertainty.

The key here for the notation is:

S(t) susceptible
E(t) exposed
I(t) infected
R(t) recovered
D(t) is deaths
C(t) is cumulative deaths (easy to just integrate in the ODE solver rather than outside).
beta(t) is the reproduction rate

With these, the "policy" related choices are:

policy and individual choice enters through the \bar{\beta} in the mean, but there are lags in the implementation/diffusion of information about the policy.
v(t) is the death rate. The time variation could be used as the arrival rate of medical advancements
- I also want to make the v(t) a nonlinear function of the I(t) to have hospital capacity limitations effect v(t). This could enter policy uncertainty.
zeta(t) is related to the stochastic arrival of "outsiders" into the system as we shut down/reopen transportation.

Finally, the aggregate stochastic elements are

the \zeta(t) dW random entry/exit
The theta dW variation in the reproduction rate. Think whether there are rallies or not, aggregate fluctuations in willingness to use masks, facebook misinformation, newspaper artiles, etc.
The v(t) could be related to a stochastic process, for example a jump-process.

ChrisRackauckas commented 4 years ago

An additive noise process like that will make it not have guaranteed positivity. We should probably make the noise multiplicative/proportional, which should be an easy way to impose that property.

jlperla commented 4 years ago

We should probably make the noise multiplicative/proportional, which should be an easy way to impose that property.

So you mean like d s(t) = beta(t) s(t) i(t) + s(t) zeta d W(t) ? If so, that is perfect and clean. Probably more realistic as well. I tried to work with logged s(t), etc. to keep thing positive, but it got uncessarily ugly which was why I had given up.

OK, I will take another shot at the ODEs this morning for discussion.

@ChrisRackauckas Are you aware of any simple extensions of the SIR that have testing and/or asymptomatic carriers with them?

ChrisRackauckas commented 4 years ago

So you mean like d s(t) = beta(t) s(t) i(t) + s(t) zeta d W(t) ?

Yup that's it.

Probably more realistic as well. I tried to work with logged s(t), etc. to keep thing positive, but it got uncessarily ugly which was why I had given up.

There's technically an Ito correction term if you do it with an SDE, and that's... definitely not something for an introduction to the model type haha.

Are you aware of any simple extensions of the SIR that have testing and/or asymptomatic carriers with them?

I don't, but maybe @sdwfrost might be able to help guide us towards the right model.

sdwfrost commented 4 years ago

There's lots of models out there! This one includes asymptomatic and hospitalization:

https://github.com/covid-projections/covid-data-model/blob/master/pyseir/models/seir_model.py

Most of the models that consider asymptomatic, testing etc. are also age structured, which adds another layer of complexity. You can take a look at an incomplete list of models with their characteristics here

jlperla commented 4 years ago

@sdwfrost Thanks so much. These are outstanding, and that is exactly what we were looking for to compare. We will take a look to see if we can find a parsimonious version to demonstrate some of the tools we have in mind for economists.

QuantEcon / lecture-source-jl

Thoughts on covid and SciML lecture outline #859

From Markov Models to ODEs and SDEs: the Wide Range of Epidemic Model Approximations

Inverse Problems, Control, and Scientific Machine Learning