paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0
1.27k stars 182 forks

Make 'cor_arr' useful again #631

Closed paul-buerkner closed 5 years ago

paul-buerkner commented 5 years ago

The cor_arr correlation structure is currently deprecated because it does not do anything useful that cannot otherwise be achieved via simple linear predictor terms. In particular, predictions are not as general as they should be for time-series models. However, I think we could implement this structure very similarly to how ARMA works in brms, which would make cor_arr somewhat useful again.
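For reference, the "simple linear predictor terms" workaround amounts to adding lagged copies of the response as ordinary predictors. A hedged base-R sketch (the column names and the commented brm() formula are illustrative, not brms API):

```r
# Build lagged copies of the response by hand (base R, using the built-in
# LakeHuron data set); these can enter the linear predictor directly.
N  <- length(LakeHuron)
df <- data.frame(y = as.numeric(LakeHuron), time = 1:N)

# First- and second-order lags of y as ordinary predictors.
df$y_lag1 <- c(NA, df$y[-N])
df$y_lag2 <- c(NA, NA, df$y[-c(N - 1, N)])

# A model on these columns, e.g. (not run here):
# fit <- brm(y ~ y_lag1 + y_lag2, data = df)
head(df, 3)
```

Rows with incomplete lags carry NA and are dropped at fitting time, which is the usual cost of the manual-lag approach.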

paul-buerkner commented 5 years ago

I tried to make it more useful but wasn't successful. I guess it's really best to just remove it. If, at some point, we get a reasonable idea, I may as well implement it from scratch.

andymilne commented 5 years ago

I have to say that I find it a very useful "convenience function" -- it's easier than having to fiddle around with the data frame (so I would prefer it not to be deprecated). But I am interested to understand in what way you were thinking of making it more useful.

paul-buerkner commented 5 years ago

I understand that you find it convenient, but I think that (a) it's in the wrong place (i.e. it shouldn't be passed to autocor) and (b) it is not unlikely to yield nonsensical results. I thought it could be helpful for models where we don't have ARMA models (so basically everything other than gaussian models), but now that I have tried again (and failed), I am even more motivated to just remove it.

andymilne commented 5 years ago

I am interested to learn how it might lead to nonsensical results.

paul-buerkner commented 5 years ago

This may happen if the mean of the time series is far from zero. There are "quick fixes" for that but no principled ones I like. Here is an example:

N <- length(LakeHuron)
df <- data.frame(
  y = as.numeric(LakeHuron),
  year = as.numeric(time(LakeHuron)),
  time = 1:N
)

library(ggplot2)
library(brms)
fit <- brm(
  y ~ 1,
  data = df,
  autocor = cor_arr(~ time, r = 2)
)

preds <- posterior_predict(fit)
preds <- cbind(
  Estimate = colMeans(preds), 
  Q5 = apply(preds, 2, quantile, probs = 0.05),
  Q95 = apply(preds, 2, quantile, probs = 0.95)
)

ggplot(cbind(df, preds), aes(x = year, y = Estimate)) +
  geom_smooth(aes(ymin = Q5, ymax = Q95), stat = "identity", size = 0.5) +
  geom_point(aes(y = y))

Bottom line is, I implemented this correlation structure ages ago and it was simply a bad idea.
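To see why a non-zero mean is problematic, consider a first-order autoregression of the response, y_t = b0 + phi * y_{t-1} + e_t: its stationary mean is b0 / (1 - phi), so for a series with mean near 580 (as for LakeHuron) and phi near 1 the fitted intercept alone is small and easily misread. A short base-R simulation with hypothetical parameter values (a sketch, not output from the model above):

```r
# Simulate an ARR(1) process whose stationary mean is far from its intercept.
set.seed(1)
n   <- 10000
phi <- 0.9
b0  <- 58                      # intercept of the process
y   <- numeric(n)
y[1] <- b0 / (1 - phi)         # start at the stationary mean (580)
for (t in 2:n) {
  y[t] <- b0 + phi * y[t - 1] + rnorm(1, sd = 1)
}
mean(y)   # close to b0 / (1 - phi) = 580, not to the intercept b0 = 58
```

Predictions that use the intercept without carrying the lagged-response terms forward are therefore pulled far away from the data, which matches the LakeHuron plot above.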

andymilne commented 5 years ago

Interesting, thank you.

hansvancalster commented 5 years ago

> This may happen if the mean of the time series is far from zero. [...] Bottom line is, I implemented this correlation structure ages ago and it was simply a bad idea.

I'm not sure if this is helpful for finding a principled solution, but here is what happens if you fit the LakeHuron data with INLA (continuing from the data frame df defined above):

library(INLA)
family <- "gaussian"
formula1 <- y ~ f(time, model = "ar1")
res1 <- inla(formula = formula1,
             data = df,
             family = family)
plot(res1$summary.random$time[, "mean"] + res1$summary.fixed$mean[1],
     ylab = "fitting result",
     type = "l")
points(df$y, col = "blue")

the fitted line obtained by the brms model corresponds to the res1$summary.fixed$mean[1] part (= the intercept). The res1$summary.random$time[, "mean"] part is the deviation from the intercept (the AR1 part). The INLA model is overfitted, which can be remedied by fixing one of the hyperparameters or by using PC priors: see https://haakonbakka.bitbucket.io/btopic115.html. See also inla.doc("ar1") for details of the way it is implemented in INLA.

Given the usefulness of these autoregression-of-the-response models, especially for non-gaussian data, I'd vote :+1: to keep or re-implement this functionality in brms, and I hope that a principled approach for the problematic case above can be found.

paul-buerkner commented 5 years ago

Thanks for following up on this. I am not sure I fully understand the above INLA example. Does AR1 here mean AR1 of the residuals (available via cor_ar), or is it really AR1 of the response in INLA? Based on the docs it seems to be equivalent to cor_ar in brms, just written down for a simple example where both are equivalent.

hansvancalster commented 5 years ago

In the example section they give an example with family = "poisson". Because the Poisson family has no residuals, I was led to believe it is autoregression of the response. But I may well be misunderstanding something, because at the beginning of inla.doc("ar1") they indeed speak of a Gaussian vector.

Here is the example they give :

library(INLA)
# simulate data
n <- 100
rho <- 0.8
prec <- 10
## note that the marginal precision would be
marg.prec <- prec * (1 - rho^2)
E <- sample(c(5, 4, 10, 12), size = n, replace = TRUE)
eta <- as.vector(arima.sim(list(order = c(1, 0, 0), ar = rho), n = n, sd = sqrt(1 / prec)))
y <- rpois(n, E * exp(eta))
data <- list(y = y, z = 1:n, E = E)
## fit the model
formula <- y ~ f(z, model = "ar1")
res1 <- inla(formula, family = "poisson", data = data, E = E)

hansvancalster commented 5 years ago

Just talked to @ThierryO about this. Apparently the INLA ar1 is neither autoregression of the residuals nor autoregression of the response. The autoregression is at the level of the random-effect parameters. In INLA, an ar1 random effect for time imposes correlation between discrete time periods, with time periods further apart being less correlated. If the correlation parameter is estimated to be zero, the ar1-type random effect reduces to an iid random effect (i.e. (1|time)).
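Concretely, an AR1 random effect over T discrete time points imposes the correlation matrix cor(u_i, u_j) = rho^|i - j| on the time-level effects, which collapses to the identity (an iid effect) when rho = 0. A small base-R sketch of that structure (illustrative only, not INLA internals):

```r
# Correlation matrix of an AR1 random effect over T discrete time points:
# cor(u_i, u_j) = rho^|i - j|.
ar1_cor <- function(T, rho) {
  rho^abs(outer(1:T, 1:T, "-"))
}

ar1_cor(4, 0.8)   # neighbouring periods correlate at 0.8, farther apart less
ar1_cor(4, 0)     # rho = 0 gives the identity matrix, i.e. an iid effect
```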

Hope this clarifies it...

paul-buerkner commented 5 years ago

It does clarify this and basically tells us where to implement such a feature (i.e., in the multilevel structure of brms). Thanks a lot!

andymilne commented 5 years ago

Maybe a related approach would also allow autoregression for Bernoulli and cumulative distributions? See https://cran.r-project.org/web/packages/mvord/vignettes/vignette_mvord.pdf

paul-buerkner commented 5 years ago

Indeed, that could work, although residuals in the binary case are not going to be identified. For all other families without a built-in dispersion parameter, this should work.
