Closed by wpetry 4 years ago
I refit the model after changing the rw1 formula from ~0 to ~1. I worried that fitting a zero intercept may have forced the predictions to zero instead of using the current state. The unexpected behavior (an excess of predictions at 0.5) persists despite this change.
Thanks for a good example. I will get back to this tomorrow. There seems to be an issue with the underlying C++ function, as calling predict(rwfit, newdata = ndat, u = nobs, type = "response") gives an out-of-bounds error.
There also seems to be an issue with plot_predict regarding whether to plot observations or means, which needs some attention.
Thanks for looking into this. I'm still a little confused about the correct way to specify the RHS of the rw1 formula for this data generating process. Would you please share the correct model code once the issue is resolved?
You're right about the opportunity to add a feature to plot_predict when working with non-Gaussian errors. I'll open a separate issue since this is unrelated to the bug described here.
Ok, there was a very simple indexing bug in the predict function, which for some weird reason I hadn't spotted before, so I managed to fix this already.
Wow, thanks for turning this around so quickly. I'm glad it was a relatively easy fix. Everything is now behaving as expected on my end.
brief description of the model
I'm attempting to implement a pure random walk model in which the underlying data generating process is a random walk in y, where y is a proportion in [0, 1]. The underlying state, y, is observed through counts of successes and failures using a logit link.
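A data generating process of this kind can be sketched in base R as follows. This is only an illustration under stated assumptions, not the original reprex code: it assumes the random walk evolves on the logit scale, and the values of nobs and sigma, the seed, and the data frame name dat are hypothetical choices. Only nt=100 and the variable names y, succ, and nobs come from the description in this issue.

```r
set.seed(42)     # arbitrary seed, used only to make the sketch reproducible

nt    <- 100     # number of time steps, as in the reprex
nobs  <- 50      # trials per time step (assumed value)
sigma <- 0.1     # sd of the random-walk innovations on the logit scale (assumed value)

# Latent random walk on the logit scale: each state is the previous state plus noise
eta <- cumsum(rnorm(nt, mean = 0, sd = sigma))

# True proportion on the probability scale via the inverse logit
y <- plogis(eta)

# Observed number of successes out of nobs trials at each time step
succ <- rbinom(nt, size = nobs, prob = y)

# Collect everything in one data frame (hypothetical name)
dat <- data.frame(time = seq_len(nt), y = y, succ = succ, nobs = nobs)
head(dat)
```

Because plogis(0) equals 0.5, a state of zero on the logit scale corresponds to a proportion of exactly 0.5, which is why the two descriptions are used interchangeably in this issue.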
description of the unexpected behavior
The problem is that when I forecast from a model fit with walker_glm(distribution = "binomial"), the prediction intervals behave strangely. Specifically, there is a very large excess of predictions at exactly 0.5 (corresponding to zero on the logit scale). The remainder of the posterior predictive distribution aligns with my expectations. We can visualize this problem in the posterior predictive intervals, and in the posterior predictions at time=101 and time=250.
potentially important clue
I noticed that all time points appear to have exactly the same number of posterior predictions equal to 0.5 (again, this is the same as saying zero on the logit scale). You can see this in the histograms above, where 1498 draws equal 0.5 at both t=101 and t=250 (the exact count will vary depending on the random seed). The reprex below provides code that shows this is true across all time points. I can't think of a mechanism that would cause this consistency of posterior draws across every time point.
reprex
We can simulate nt=100 time steps of the true data y, the number of observed successes succ out of nobs trials, and the estimated state on the probability scale yhat as:
We can then fit the model using walker_glm() with a very weak prior on the value of yhat at time=0, and confirm that the parameters are correctly recovered:
Finally, make the forecast for times 101 to 300:
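Separately from the forecast call itself, the clue about equal counts at 0.5 can be checked on any matrix of forecast draws along these lines. This is a sketch in base R only: the matrix pred is a simulated stand-in and not output from walker, and n_draws, n_times, and the choice to plant the same rows at every time point are hypothetical. Only the count of 1498 is taken from the description earlier in this issue.

```r
set.seed(1)      # arbitrary seed, used only to make the sketch reproducible

n_draws <- 4000  # hypothetical number of posterior draws
n_times <- 5     # hypothetical number of forecast time points

# Stand-in draws on the probability scale: rows are draws, columns are time points
pred <- matrix(plogis(rnorm(n_draws * n_times)), nrow = n_draws, ncol = n_times)

# Mimic the reported symptom by planting 1498 draws at plogis(0) = 0.5 in every column
stuck <- sample(n_draws, 1498)
pred[stuck, ] <- 0.5

# Count draws equal to 0.5 at each time point, using a tolerance rather than ==
n_half <- colSums(abs(pred - 0.5) < 1e-12)
n_half

# TRUE when every time point has the same number of draws at 0.5
length(unique(n_half)) == 1
```

With real forecast output, pred would be replaced by the draws on the response scale, arranged so that each column holds one time point.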
Thank you, @helske, for your work on this package to make state space modeling intuitive and efficient.