paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0
1.29k stars 187 forks source link

Hypothesis tests for calculated/predicted differences? #261

Closed puterleat closed 7 years ago

puterleat commented 7 years ago

I have a couple of questions and perhaps also a suggestion!

Let's say we have a model with multiple covariates. We make predictions from the model (e.g. using posterior_predict or posterior_linpredict) for a grid of values of interest, and we calculate the differences between points in the grid based on the draws, and summarise the posterior for these differences (e.g. between groups in an experiment).

Does it make sense to summarise these difference with an evidence ratio in the same way as brms::hypothesis does for parameter estimates? If so, would it be possible to extend hypothesis to allow for these kind of contrasts between predictions? That is, if you provided a newdata argument each row might get labelled with a letter A...Z and you could write something like this:

fit <- stan_lm(mpg~factor(cyl)+wt+disp, data=mtcars, prior=R2(location=.5))
newdata = expand.grid(cyl=factor(c(4,6,8)), 
        wt=mean(mtcars$wt), disp=mean(mtcars$disp)) %>% 
    mutate(cell = LETTERS[1:n()])

brms::hypothesis(fit, newdata, c("A - B > 2", "A - C < 0"))

And expect the type of output you normall see when doing directed hypotheses.

puterleat commented 7 years ago

Sorry for multiple posts, but I realised this wasn't actually that difficult to code using tidybayes:

prediction.hypotheses <- function(fit, newdata, hypotheses){
  if (is.null(newdata$cell)){
    newdata <- newdata %>% mutate(cell = LETTERS[1:n()])  
  }

  preds <- tidybayes::add_fitted_samples(newdata, fit) %>% 
    reshape2::dcast(.iteration~cell, value.var="estimate")
  return(brms::hypothesis(preds, hypotheses))
}

fit <- stan_lm(mpg~factor(cyl)+wt+disp, data=mtcars, prior=R2(location=.5))
newdata = expand.grid(cyl=factor(c(4,6,8)), wt=mean(mtcars$wt), disp=mean(mtcars$disp)) 
prediction.hypotheses(fit, newdata, c("A - B > 2", "A - C < 0"))

I suppose my question of whether the Evid.Ratio makes sense in this context stands though...

paul-buerkner commented 7 years ago

You can also do it directly via brms:::hypothesis. You just need to pass a data.frame:

fit <- brm(mpg~factor(cyl)+wt+disp, data=mtcars)
newdata = expand.grid(
  cyl=c(4,6,8), 
  wt=mean(mtcars$wt), 
  disp=mean(mtcars$disp)
) %>% 
  mutate(cell = LETTERS[1:n()])

# fitted is the brms name for what rstanarm calls posterior_linpred
fi <- as.data.frame(fitted(fit, newdata, summary = FALSE))
names(fi) <- c("A", "B", "C")

hypothesis(fi, c("A - B > 2", "A - C < 0"))
puterleat commented 7 years ago

That's really neat - thanks. I think I'm still getting my head around this way of making inferences from models, and adjusting to the simplicity of just doing simple arithmetic on the parameter draws or predictions to answer questions of interest. It seems too good to be true!

paul-buerkner commented 7 years ago

Welcome to the world of bayesian statistics! :-)

Am 31.08.2017 10:53 vorm. schrieb "puterleat" notifications@github.com:

Closed #261 https://github.com/paul-buerkner/brms/issues/261.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/paul-buerkner/brms/issues/261#event-1229123455, or mute the thread https://github.com/notifications/unsubscribe-auth/AMVtABTH_oWY2HyMNnQNZwIeNURn1FeLks5sdnR9gaJpZM4PHNCy .