stan-dev / rstanarm

rstanarm R package for Bayesian applied regression modeling
https://mc-stan.org/rstanarm
GNU General Public License v3.0
385 stars 132 forks source link

Group level terms not being passed to posterior_predict for stan_nlmer models #544

Open jwdorrough opened 3 years ago

jwdorrough commented 3 years ago

Summary:

For stan_nlmer models, group specific terms are not being passed to posterior_predict.

Description:

I think I have encountered a possible error when trying to draw from the posterior predicted distribution of stan_nlmer models.

It appears that the group specific terms are not being passed to posterior_predict. In other words, it seems to behave as if re.form = NA and any predictions for groups end up drawn from the same posterior distribution. I have been able to replicate the problem using posterior_predict, add_predicted_draws etc.

The example below is drawn from the stan_nlmer vignette https://cran.r-project.org/web/packages/rstanarm/vignettes/glmer.html. If i repeat the example in the vignette and in the new data change the level for Tree to a within sample tree, it generates the same distribution of predicted values as for the out of sample tree. I also tried passing group level terms (e.g. Asy|Tree , ~1|Tree) direct to re.form but this gave errors, although i am not quite sure I specified them correctly (I haven’t got the hang of nlmer group level specification).

data("Orange", package = "datasets")
Orange$age <- Orange$age / 100
Orange$circumference <- Orange$circumference / 100

post1 <- stan_nlmer(circumference ~ SSlogis(age, Asym, xmid, scal) ~ Asym|Tree,
                    data = Orange, cores = 2, seed = 12345, init_r = 0.5)

nd1 <- expand.grid(age = 1:20, Tree = as.factor(1:5))

pp <- posterior_predict(post1, newdata = nd1)

prediction <- colMeans(pp) 
nd_samples <- cbind(nd1, prediction)

ggplot(Orange, aes(x = age, y = circumference, colour = Tree, fill = Tree)) +
  geom_point() +
  geom_line(aes(y=prediction, x=age, colour=Tree), data = nd_samples)
#all predictions are pretty much identical

nd_samples%>%
  filter(age==10)%>%
  group_by(Tree)%>%
  summarise(mn_circumference=mean(prediction))

# # A tibble: 5 x 2
#   Tree  mn_circumference
#   <fct>            <dbl>
# 1 1                 1.32
# 2 2                 1.32
# 3 3                 1.33
# 4 4                 1.33
# 5 5                 1.32

Thanks, Josh

Operating System:Windows10 rstanarm Version:2.21.1