stan-dev / rstanarm

rstanarm R package for Bayesian applied regression modeling
https://mc-stan.org/rstanarm
GNU General Public License v3.0

Question on "_NEW_" #600

Open shijbian opened 1 year ago

shijbian commented 1 year ago

Summary:

Why is there an extra group level called "_NEW_" after fitting a model with stan_glmer?

Description:

Why do we have a variable called b[(Intercept) Tree:_NEW_Tree]?

When I ran the R code below:

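library(rstanarm)  # provides stan_glmer()
# Orange ships with R's built-in datasets package, so no data() call is needed
nm1 <- stan_glmer(circumference ~ age + (1 | Tree), data = Orange)
nm1_estimates <- rstan::extract(nm1$stanfit)
head(nm1_estimates$b)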

The matrix nm1_estimates$b has 6 columns. Columns 1 to 5 correspond to the five levels of Tree, b[(Intercept) Tree:3] through b[(Intercept) Tree:4]. The 6th column corresponds to b[(Intercept) Tree:_NEW_Tree]. The estimates for b[(Intercept) Tree:3], b[(Intercept) Tree:4], and b[(Intercept) Tree:_NEW_Tree] are shown below:

nm1$stanfit
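To make the column layout concrete, here is a minimal sketch of how the observed-level columns could be separated from the "_NEW_" column by name. The helper split_new_level() is hypothetical (it is not part of rstanarm), and the toy matrix only imitates the column names reported above; its values are made up. The commented usage at the end assumes that as.matrix(nm1$stanfit) returns a draws matrix carrying the same b[...] column names that the printed stanfit shows.

```r
# Hypothetical helper (not part of rstanarm): split a named draws matrix into
# columns for observed levels and columns for the "_NEW_" placeholder level.
split_new_level <- function(draws) {
  stopifnot(is.matrix(draws), !is.null(colnames(draws)))
  is_new <- grepl("_NEW_", colnames(draws), fixed = TRUE)
  list(
    observed  = draws[, !is_new, drop = FALSE],
    new_level = draws[, is_new, drop = FALSE]
  )
}

# Toy stand-in for the fitted model's draws: 4 fake draws, 6 columns named
# like the ones in this issue. The values are made up for illustration only.
tree_levels <- c("3", "1", "5", "2", "4")  # same order as levels(Orange$Tree)
col_names <- c(
  paste0("b[(Intercept) Tree:", tree_levels, "]"),
  "b[(Intercept) Tree:_NEW_Tree]"
)
toy_draws <- matrix(seq_len(24), nrow = 4, dimnames = list(NULL, col_names))

parts <- split_new_level(toy_draws)
colnames(parts$observed)   # the five observed Tree levels
colnames(parts$new_level)  # the single "_NEW_" column

# Assumed usage with the real fit (requires the model above to have been run):
# draws   <- as.matrix(nm1$stanfit)
# b_draws <- draws[, grepl("^b\\[", colnames(draws)), drop = FALSE]
# parts   <- split_new_level(b_draws)
```

Matching on the literal string "_NEW_" rather than on column position keeps the split valid if the model has more than one grouping factor, since each factor would presumably get its own "_NEW_" column.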

Why do we need a level b[(Intercept) Tree:_NEW_Tree]? How should it be interpreted alongside the group effects for the observed levels?

According to the vignettes, "These random draws from the posterior distribution of the group-specific parameters are stored each time a joint model is estimated using stan_glmer, stan_mvmer, or stan_jm; they are saved under an ID value called "_NEW_"."

Does b[(Intercept) Tree:_NEW_Tree] exist because some levels might be missing during MCMC due to random simulation?

Reproducible Steps:

library(rstanarm)  # provides stan_glmer()
nm1 <- stan_glmer(circumference ~ age + (1 | Tree), data = Orange)
nm1_estimates <- rstan::extract(nm1$stanfit)
head(nm1_estimates$b)

nm1$stanfit

RStanARM Version:

‘2.21.4’

R Version:

‘4.3.1’

Operating System:

macOS 11.7.7 (20G1345)