ASKurz / Statistical_Rethinking_with_brms_ggplot2_and_the_tidyverse_2_ed

The bookdown version lives here:
https://bookdown.org/content/4857/

Model 16.5 #18

Closed · ASKurz closed this issue 3 years ago

ASKurz commented 3 years ago

In Section 16.4.2, McElreath fit a bivariate population dynamics model using ordinary differential equations (ODEs). His proposed statistical model was

$$
\begin{aligned}
h_t & \sim \operatorname{Log-Normal}(\log (p_H H_t), \sigma_H) \\
l_t & \sim \operatorname{Log-Normal}(\log (p_L L_t), \sigma_L) \\
H_1 & \sim \operatorname{Log-Normal}(\log 10, 1) \\
L_1 & \sim \operatorname{Log-Normal}(\log 10, 1) \\
H_{t > 1} & = H_1 + \int_1^t H_u (b_H - m_H L_u) \, du \\
L_{t > 1} & = L_1 + \int_1^t L_u (b_L H_u - m_L) \, du \\
b_H, m_L & \sim \operatorname{Normal}(1, 0.5) \\
m_H, b_L & \sim \operatorname{Normal}(0.05, 0.05) \\
\sigma_H, \sigma_L & \sim \operatorname{Exponential}(1) \\
p_H, p_L & \sim \operatorname{Beta}(40, 200),
\end{aligned}
$$

where $h_t$ and $l_t$ are the observed pelt counts and $H_t$ and $L_t$ are the latent population sizes.

McElreath fit the model with rstan::stan(). His Stan model code comes in the rethinking package. Here's how to fit the model.

# load
library(rethinking)

# data
data(Lynx_Hare)

dat_list <- list(
  N = nrow(Lynx_Hare), 
  pelts = Lynx_Hare[,2:3] )

# Stan code
data(Lynx_Hare_model)

# fit the model
m16.5 <- stan( model_code=Lynx_Hare_model , 
               data=dat_list , 
               chains=3 , cores=3 , 
               control=list( adapt_delta=0.95 ) )
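
To inspect the marginal posterior, you might follow up with a precis() summary (this step is my suggestion, not from McElreath's text):

# summarize the marginal posterior; depth = 2 displays
# vector-valued parameters like theta, pop_init, sigma, and p
precis( m16.5 , depth=2 )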

If you execute cat(Lynx_Hare_model), you can see the Stan model code.

functions {
  real[] dpop_dt( real t,                 // time
                  real[] pop_init,        // initial state {lynx, hares}
                  real[] theta,           // parameters
                  real[] x_r, int[] x_i) {  // unused
    real L = pop_init[1];
    real H = pop_init[2];
    real bh = theta[1];
    real mh = theta[2];
    real ml = theta[3];
    real bl = theta[4];
    // differential equations
    real dH_dt = (bh - mh * L) * H;
    real dL_dt = (bl * H - ml) * L;
    return { dL_dt , dH_dt };
  }
}
data {
  int<lower=0> N;              // number of measurement times
  real<lower=0> pelts[N,2];    // measured populations
}
transformed data{
  real times_measured[N-1];    // N-1 because first time is initial state
  for ( i in 2:N ) times_measured[i-1] = i;
}
parameters {
  real<lower=0> theta[4];      // { bh, mh, ml, bl }
  real<lower=0> pop_init[2];   // initial population state
  real<lower=0> sigma[2];      // measurement errors
  real<lower=0,upper=1> p[2];  // trap rate
}
transformed parameters {
  real pop[N, 2];
  pop[1,1] = pop_init[1];
  pop[1,2] = pop_init[2];
  pop[2:N,1:2] = integrate_ode_rk45(
    dpop_dt, pop_init, 0, times_measured, theta,
    rep_array(0.0, 0), rep_array(0, 0),
    1e-5, 1e-3, 5e2);
}
model {
  // priors
  theta[{1,3}] ~ normal( 1 , 0.5 );    // bh,ml
  theta[{2,4}] ~ normal( 0.05, 0.05 ); // mh,bl
  sigma ~ exponential( 1 );
  pop_init ~ lognormal( log(10) , 1 );
  p ~ beta(40,200);
  // observation model
  // connect latent population state to observed pelts
  for ( t in 1:N )
    for ( k in 1:2 )
      pelts[t,k] ~ lognormal( log(pop[t,k]*p[k]) , sigma[k] );
}
generated quantities {
  real pelts_pred[N,2];
  for ( t in 1:N )
    for ( k in 1:2 )
      pelts_pred[t,k] = lognormal_rng( log(pop[t,k]*p[k]) , sigma[k] );
}

Based on @mages's blog post, PK/PD reserving models, I know ODE models are possible with brms. However, this is beyond my current skill set.

sjwild commented 3 years ago

Unfortunately, I wasn't able to get this model to work in brms. In response to this question about brms, Buerkner says that brms is not set up to handle separate outputs from the same function, which is basically what we are after. Based on his response, I don't think brms is set up for the type of model used by McElreath.

Steve

ASKurz commented 3 years ago

Got it. Thanks for the update, @sjwild.

mages commented 3 years ago

Let me get back to you in the next couple of days. The trick is to introduce an indicator variable to distinguish between hare and lynx (or paid and outstanding claims in my example). I mentioned this in my blog post, and also in section 3.4.3.1 of Gesmann, M., and Morris, J. “Hierarchical Compartmental Reserving Models.” Casualty Actuarial Society, CAS Research Papers, 19 Aug. 2020, https://www.casact.org/research/research-papers/Compartmental-Reserving-Models-GesmannMorris0820.pdf
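
In the meantime, here is a minimal sketch of the indicator-variable idea (the long-format column names t, Specie, and Pelts are assumptions that match the names used later in this thread):

# stack the two pelt series into long format so that a single
# nonlinear formula can branch on the Specie indicator
library(rethinking)
library(tidyverse)

data(Lynx_Hare)

LH <- Lynx_Hare %>%
  mutate(t = row_number()) %>%
  pivot_longer(c(Lynx, Hare), names_to = "Specie", values_to = "Pelts")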

ASKurz commented 3 years ago

Thank you, @mages. Any help would be most appreciated.

mages commented 3 years ago

I published a short blog post that demonstrates how you can fit the Lotka-Volterra model with brms. I hope this provides you with enough details for your own purpose: https://magesblog.com/post/2021-02-08-fitting-multivariate-ode-models-with-brms/

ASKurz commented 3 years ago

Wow, an entire blog post on the topic; this looks stellar!

mages commented 3 years ago

Might as well. It was in the back of my mind for some time. I hope the post clarifies things for you.

ASKurz commented 3 years ago

I'm still slowly working my way through your example, which I'm finding very instructive. Could you clarify a point of confusion?

Bob Carpenter's model (here) seems to be parameterized such that his "intercepts"--what he called z_init[]--provide the estimates for when $t = 1$. However, the corresponding parameters in your model--called stdNHare0_Intercept and stdNLynx0_Intercept--seem to provide estimates for when $t = 0$, after they are converted back to the data metric. Where is the difference in the code he and you used that accounts for this? And further, how might one alter your code to parameterize the intercepts so they return the estimates for when $t = 1$?

mages commented 3 years ago

I am not sure I understand your question. In Bob's section "Lotka-Volterra error model" he writes z(0) = z_init. Where does he say that he provides those parameters for t=1?

The term intercept is confusing in the context of non-linear models, particularly for parameters that are not time dependent, such as birth and mortality rates. Indeed, Bob doesn't use the term 'intercept'. In my case, the 'Intercept' labels are assigned automatically by brms.

In the case of autonomous ODEs, the solution is defined by its initial conditions and parameters. Convention has it that we set the initial conditions at t=0 and use our observations to fit those starting points and the parameters, regardless of whether we have data for t=0. However, the starting points don't always have to be free parameters. Suppose you wanted to model the flow of water from one tank into another. Then you would know exactly how much water was in the first tank and the empty tank at t=0.

ASKurz commented 3 years ago

My wording wasn’t clear. Let me try again:

To my eye, it seems as if Bob’s z_init[1] and z_init[2] parameters are the analogues of your Hare0 and Lynx0. Is this not correct?

Yet focusing just on point estimates, Bob’s z_init[1] = 33.956 and z_init[2] = 5.933. Your Hare0 = 23.61742242 and Lynx0 = 6.59691147.

Also, when I look at the original data values and at your posterior predictive plot, it appears that Bob’s two values match up well with the expectations when t = 1, whereas your estimates match up with the expectations when t = 0.

mages commented 3 years ago

I think I get it now. I treat my initial parameters as 'free' parameters and use the data to fit the model at t = 1, 2, ..., N. In Bob's case, he has two input data sets, y and y_init, which I am not sure how to deal with in brms.

In my case I can predict the values at t=1 from the model. Those are close to Bob's values:

library(tidybayes)
library(data.table)

pred <- predicted_draws(mod, newdata = LH, n = 1000)
data.table(pred)[t == 1, .(Pelts_at_t1 = mean(.prediction)), by = Specie]

   Specie Pelts_at_t1
1:   Lynx    6.132636
2:   Hare   35.672955

I have to think about this a little more.

mages commented 3 years ago

Hang on, I think I made a schoolboy error. Should I have set

  // Set initial values
  y0[1] = 30.0; y0[2] = 4.0;

in the LV function?

Give me a few minutes to run my model again.

ASKurz commented 3 years ago

For what it's worth, once you factor in McElreath's p parameters, here are his analogues to Hare0 and Lynx0:

141.6329 * 0.1794219  # 25.41204 (Hare0 analogue)
36.54094 * 0.1753958  # 6.409127 (Lynx0 analogue)
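
In case it's useful, here's a sketch of one way to compute values like these from m16.5 (the posterior means will vary a bit from run to run):

# posterior means of the initial latent populations (pop_init) times
# the trap rates (p); column 1 is lynx and column 2 is hares, matching
# the column order of dat_list$pelts
post <- rethinking::extract.samples(m16.5)
mean(post$pop_init[, 2]) * mean(post$p[, 2])  # Hare0 analogue
mean(post$pop_init[, 1]) * mean(post$p[, 1])  # Lynx0 analogue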

mages commented 3 years ago

No schoolboy error. The following statement from Bob shows that he explicitly provides data for the initial time point and then removes that observation from y:

y_init <- c(lynx_hare_df$Hare[1], lynx_hare_df$Lynx[1])
y <- as.matrix(lynx_hare_df[2:(N + 1), 2:3])

As I said above, I treat the initial state as a 'free' parameter with no observation at t=0 and start using the data from t=1. So, I believe both approaches are valid and provide similar results for t=1, and the time-independent parameters are close too, although I use different priors.

ASKurz commented 3 years ago

Got it. Thank you for the clarification!

mages commented 3 years ago

Good question! Btw, I just noticed that I actually made a sloppy schoolboy error when I presented the back-transformed parameter estimates. I have to transform the posterior samples, not the output of fixef(). That's fixed now.
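
For anyone following along, the general point is that nonlinear transformations should be applied to each posterior draw and then summarized. A minimal sketch, assuming the stdN* parameterization from the blog post and brms's default b_*_Intercept naming:

# back-transform each draw, then summarize; transforming the fixef()
# summaries directly gives the wrong Est.Error
draws <- posterior_samples(mod)
Hare0_draws <- 10 * exp(draws$b_stdNHare0_Intercept)
mean(Hare0_draws)  # posterior mean of Hare0
sd(Hare0_draws)    # posterior SD of Hare0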

ASKurz commented 3 years ago

Sure. It was just the standard deviation (Est.Error) that didn't transform properly with your initial fixef() method, right?

mages commented 3 years ago

Pretty much, it was sloppy.

ASKurz commented 3 years ago

Yeah, those nasty posterior SDs can make transformations a bear. You know mistakes like that slip through the peer-review process all the time. So it goes…

If you’re game, I have another question:

I’d like to adapt your workflow to incorporate McElreath’s missing data model. Would you recommend an approach like the one below?

frml <- 
  bf(
    # These two lines have changed
    Pelts ~ eta * p,
    p ~ 0 + Specie,

    nlf(eta ~ log(
      LV(t, Hare0, Lynx0,
         brHare, mrHare, brLynx, mrLynx, delta))),
    nlf(Hare0 ~ 10 * exp(stdNHare0)),
    nlf(Lynx0 ~ 10 * exp(stdNLynx0)),
    nlf(brHare ~ 0.5 * exp(0.25 * stdNbrHare)),
    nlf(mrHare ~ 0.025 * exp(0.25 * stdNmrHare)),
    nlf(brLynx ~ 0.025 * exp(0.25 * stdNbrLynx)),
    nlf(mrLynx ~ 0.8 * exp(0.25 * stdNmrLynx)),
    stdNHare0 ~ 1,  stdNLynx0 ~ 1,
    stdNbrHare ~ 1, stdNmrHare ~ 1,
    stdNbrLynx ~ 1, stdNmrLynx ~ 1,
    sigma ~ 0 + Specie,
    nl = TRUE
  )

# with new prior
prior(beta(40, 200), nlpar = p, lb = 0, ub = 1)

Or would you work the p parameters into your LotkaVolterra script somehow?

mages commented 3 years ago

Looks reasonable to me. Give it a go.

ASKurz commented 3 years ago

Took a hot minute, but it turns out the correct approach is this:

frml <- 
  bf(
    # These three lines have changed
    Pelts ~ log(eta * p),
    p ~ 0 + Specie,
    nlf(eta ~ LV(t, Hare0, Lynx0, brHare, mrHare, brLynx, mrLynx, delta)),

    nlf(Hare0 ~ 10 * exp(stdNHare0)),
    nlf(Lynx0 ~ 10 * exp(stdNLynx0)),
    nlf(brHare ~ 0.5 * exp(0.25 * stdNbrHare)),
    nlf(mrHare ~ 0.025 * exp(0.25 * stdNmrHare)),
    nlf(brLynx ~ 0.025 * exp(0.25 * stdNbrLynx)),
    nlf(mrLynx ~ 0.8 * exp(0.25 * stdNmrLynx)),
    stdNHare0 ~ 1,  stdNLynx0 ~ 1,
    stdNbrHare ~ 1, stdNmrHare ~ 1,
    stdNbrLynx ~ 1, stdNmrLynx ~ 1,
    sigma ~ 0 + Specie,
    nl = TRUE
  )

# with new prior
prior(beta(40, 200), nlpar = p, lb = 0, ub = 1)

As long as you set inits = 0 within brm(), it fits like a dream.
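
For completeness, here's a sketch of what the full call might look like (assuming the LotkaVolterra Stan functions and LH data from mages' post, plus a mypriors object collecting the standard-normal priors from the post and the new beta prior on p):

# a sketch, not the exact call from this thread
mod <- brm(
  frml,
  data = LH,
  family = lognormal(),
  prior = mypriors,
  stanvars = stanvar(scode = LotkaVolterra, block = "functions"),
  inits = 0  # all chains start at zero, as noted above
)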

mages commented 3 years ago

That's great news! Of course, in hindsight it is obvious that 'p' has to be inside the log statement. Do you get similar estimates compared with McElreath's model?

ASKurz commented 3 years ago

Yeah, it's really close. The posterior draws from McElreath's model are nicer in that you get the draws for all the time points for both the pelt counts and the latent animal counts. But you can get those with a little brms post-processing, if you're careful.
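
Here's a hedged sketch of that post-processing, assuming the mod and LH objects from the blog-post workflow (extracting the nlpar is a suggestion on my part, not necessarily the only way):

# draws of the latent population state: eta is the LV() solution itself,
# so extracting that nonlinear parameter gives the latent animal counts
latent_draws <- fitted(mod, newdata = LH, summary = FALSE, nlpar = "eta")

# posterior-predictive pelt counts, analogous to pelts_pred in m16.5
pelts_draws <- posterior_predict(mod, newdata = LH)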

mages commented 3 years ago

Congratulations!

ASKurz commented 3 years ago

:beers:

ASKurz commented 3 years ago

The current implementation lives here.

sjwild commented 3 years ago

Congratulations!