Closed ASKurz closed 3 years ago
Unfortunately I wasn't able to get this model to work in brms. In response to this question about brms, Buerkner says that brms is not set up to handle separate outputs from the same function, which is basically what we are after. Based on his response, I don't think brms is set up of the type of model used by McElreath.
Steve
Got it. Thanks for the update, @sjwild.
Let me get back to you in the next couple of days. The trick is to introduce an indicator variable to distinguish between hare and lynx (or paid and outstanding claims in my example). I mentioned this in my blog post, and also in section 3.4.3.1 of Gesmann, M., and Morris, J. “Hierarchical Compartmental Reserving Models.” Casualty Actuarial Society, CAS Research Papers, 19 Aug. 2020, https://www.casact.org/research/research-papers/Compartmental-Reserving-Models-GesmannMorris0820.pdf
Thank you, @mages. Any help would be most appreciated.
I published a short blog post that demonstrates how you can fit the Lotka-Volterra model with brms
. I hope this provides you with enough details for your own purpose: https://magesblog.com/post/2021-02-08-fitting-multivariate-ode-models-with-brms/
Wow, an entire blog post on the topic; this looks stellar!
Might as well. It was in the back of my mind for some time. I hope the post clarifies things for you.
I'm still slowly working my way through your example, which I'm finding very instructive. Could you clarify a point of confusion?
Bob Carpenter's model (here) seems to be parameterized such that his "intercepts"--what he called z_init[]
--provide the estimates for when $t = 1$. However, the corresponding parameters in your model--called stdNHare0_Intercept
and stdNLynx0_Intercept
--seem to provide estimates for when $t = 0$, after they are converted back to the data metric. Where is the difference in the code he and you used that accounts for this? And further, how might one alter your code to parameterized the intercepts to return the estimates for when $t = 1$?
I am not sure I understand your question. In Bob's section "Lotka-Volterra error model" he writes z(0) = zini. Where does he say that he provides those parameters for t=1?
The term intercept is confusing in the context of non-linear models, particular for parameters that are not time dependent, such as birth and mortality rates. Indeed, Bob doesn't use the term 'intercept'. In my case labels 'Intercept' are assigned automatically by 'brms'.
In the case of autonomous ODEs the solution is defined by its initial parameters and convention has it that we set them at t=0 and use our observations to fit those initial start points and parameters, regardless if we have data for t=0. However, the start points don't always have to be parameters. Suppose you wanted to model the flow of water from one tank into another. Then you know exactly how much water was in the first tank and the empty tank at t=0.
My wording wasn’t clear. Let me try again:
To my eye, it seems as if Bob’s z_init[1]
and z_init[2]
parameters are the analogues of your Hare0
and Lynx0
. Is this not correct?
Yet focusing just on point estimates, Bob’s z_init[1] = 33.956
and z_init[2] = 5.933
. Your Hare0 = 23.61742242
and Lynx0 = 6.59691147
.
Also, when I look at the original data values and at your posterior predictive plot, it appears that Bob’s two values match up well with the expectations when t = 1, whereas your estimates match up with the expectations when t = 0.
I think, I get it now. I treat my initial parameters as 'free' parameters and use the data to fit the model from t=1,2,... N. In Bob's case he has two input data sets: y and y_init, which I am not sure how to deal with in brms
In my case I can predict the values at t=1 from the model. Those are close to Bob's values:
pred <- predicted_draws(mod, newdata = LH, n= 1000)
data.table(pred)[t==1, .(Pelts_at_t1=mean(.prediction)) , by=Specie]
Specie Pelts_at_t1
1: Lynx 6.132636
2: Hare 35.672955
I have to think about this a little more.
Hang on, I think I made a school boy error. Should I have set
// Set initial values
y0[1] = 30.0; y0[2] = 4.0;
in the LV
function?
Give me a few minutes to run my model again.
For what it's worth, once you factor in McElreath's p
parameters, here are his analogues to Hare0 and Lynx0:
141.6329 * 0.1794219 == 25.41204
36.54094 * 0.1753958 == 6.409127
No schoolboy error. The following statement from Bob shows that he explicitly provides data for the initial start date and then remove that observation from y:
y_init <- c(lynx_hare_df$Hare[1], lynx_hare_df$Lynx[1])
y <- as.matrix(lynx_hare_df[2:(N + 1), 2:3])
As I said above. I treat the initial state as a 'free' parameter with no observation at t=0 and start using the data from t=1. So, I believe both approaches are valid and provide similar results for t=1 and the time independent parameters are close too, although I use different priors.
Got it. Thank you for the clarification!
Good question! Btw, I just noticed that I actually made a sloppy schoolboy error when I presented the back transformed estimated parameters. I have to transform the posterior samples and not the output of fixef
. That's fixed now.
Sure. It was just the standard deviation (Est.Error
) that didn't transform properly with your initial fixef()
method, right?
Pretty much, it was sloppy.
Yeah, those nasty posterior SD’s can make transformation a bear. You know mistakes like that are slipping through the peer-review process. So it goes…
If you’re game, I have another question:
I’d like to adapt your workflow to incorporate McElreath’s missing data model. Would you recommend an approach like the one below?
frml <-
bf(
# These two lines have changed
Pelts ~ eta * p,
p ~ 0 + Specie,
nlf(eta ~ log(
LV(t, Hare0, Lynx0,
brHare, mrHare,brLynx, mrLynx, delta)
)
),
nlf(Hare0 ~ 10 * exp(stdNHare0)),
nlf(Lynx0 ~ 10 * exp(stdNLynx0)),
nlf(brHare ~ 0.5 * exp(0.25 * stdNbrHare)),
nlf(mrHare ~ 0.025 * exp(0.25 * stdNmrHare)),
nlf(brLynx ~ 0.025 * exp(0.25 * stdNbrLynx)),
nlf(mrLynx ~ 0.8 * exp(0.25 * stdNmrLynx)),
stdNHare0 ~ 1, stdNLynx0 ~ 1,
stdNbrHare ~ 1, stdNmrHare ~ 1,
stdNbrLynx ~ 1, stdNmrLynx ~ 1,
sigma ~ 0 + Specie,
nl = TRUE
)
# with new prior
prior(beta(40, 200), nlpar = p, lb = 0, ub = 1)
Or would you work the p
parameters into your LotkaVolterra
script, somehow?
Looks reasonable to me. Give it go.
Took a hot minute, but it turns out the correct approach is this:
frml <-
bf(
# These three lines have changed
Pelts ~ log(eta * p),
p ~ 0 + Specie,
nlf(eta ~ LV(t, Hare0, Lynx0, brHare, mrHare,brLynx, mrLynx, delta)),
nlf(Hare0 ~ 10 * exp(stdNHare0)),
nlf(Lynx0 ~ 10 * exp(stdNLynx0)),
nlf(brHare ~ 0.5 * exp(0.25 * stdNbrHare)),
nlf(mrHare ~ 0.025 * exp(0.25 * stdNmrHare)),
nlf(brLynx ~ 0.025 * exp(0.25 * stdNbrLynx)),
nlf(mrLynx ~ 0.8 * exp(0.25 * stdNmrLynx)),
stdNHare0 ~ 1, stdNLynx0 ~ 1,
stdNbrHare ~ 1, stdNmrHare ~ 1,
stdNbrLynx ~ 1, stdNmrLynx ~ 1,
sigma ~ 0 + Specie,
nl = TRUE
)
# with new prior
prior(beta(40, 200), nlpar = p, lb = 0, ub = 1)
As long as you set inits = 0
within brm()
, it fits like a dream.
That's great news! Of course, in hindsight it is obvious that 'p' has to be insight the log statement. Do you get similar estimates when compared with McElreath model?
Yeah, it's really close. The posterior_samples from McElreath's model are nicer in that you get the draws for all the time points in both the pelts counts and the latent animal counts. But you can get those with a little brms post processing, if you're careful.
Congratulations!
:beers:
Congratulations!
In Section 16.4.2, McElreath fit a bivariate population dynamics model using ordinary differential equations (ODEs). His proposed statistical model was
McElreath fit the model with
rstan::stan()
. His Stan model code comes in the rethinking package. Here's how to fit the model.If you execute
cat(Lynx_Hare_model)
, you can see the Stan model code.Based on @mages's blog post, PK/PD reserving models, I know ODE models are possible with brms. However, this is beyond my current skill set.