merliseclyde / BAS

BAS R package for Bayesian Model Averaging and Variable Selection
https://merliseclyde.github.io/BAS/
GNU General Public License v3.0
42 stars 16 forks source link

Odd behavior Beta-binomial model prior with include.always #40

Closed vandenman closed 5 years ago

vandenman commented 5 years ago

Describe the bug

This is more a theoretical issue rather than a programming bug. Prior model probabilities of a beta-binomial model prior are surprising and perhaps incorrect when always including some predictors.

To Reproduce

A small example below:

rm(list = ls())
library(BAS)

getModelName <- function(x, params) {
  x <- x[-1L]
  if (length(x))
    return(paste(params[x], collapse = " + "))
  else return("Null model")
}

getPriorProbsPerModel <- function(basObj) {
  params <- basObj$namesx[-1L]
  paramsInModels <- sapply(basObj$which, getModelName, params = params)
  out <- data.frame(model = paramsInModels, priorProb = basObj$priorprobs)
  return(out[order(lengths(basObj$which)), ])
}

dat <- read.table("https://raw.githubusercontent.com/jasp-stats/jasp-desktop/development/Resources/Data%20Sets/Big%205%20(Dolan%2C%20Oort%2C%20Stoel%20%26%20Wicherts%2C%202009).csv", header = TRUE)
head(dat)

modelprior <- beta.binomial(1.0, 1.0)

# constraint via include.always
fit1 <- bas.lm(Neuroticism ~ Extraversion + Openness + Agreeableness, data = dat, modelprior = modelprior, 
               include.always = ~ 1 + Agreeableness)

# no constraint
fit2 <- bas.lm(Neuroticism ~ Extraversion + Openness + Agreeableness, data = dat, modelprior = modelprior)

which gives the following prior model probabilities:

getPriorProbsPerModel(fit1) # constrained model, always includes Agreeableness
                                    model  priorProb
1                           Agreeableness 0.0833 # <- Least complex model, low prob
3                Openness + Agreeableness 0.0833
4            Extraversion + Agreeableness 0.0833
2 Extraversion + Openness + Agreeableness 0.25   # <- most complex model, highest prob
getPriorProbsPerModel(fit2) # unconstrained model, as expected
                                    model  priorProb
1                              Null model 0.25
3                                Openness 0.0833
5                            Extraversion 0.0833
7                           Agreeableness 0.0833
2                Openness + Agreeableness 0.0833
4                 Extraversion + Openness 0.0833
6            Extraversion + Agreeableness 0.0833
8 Extraversion + Openness + Agreeableness 0.25

So it seems that BAS computes the prior probabilities as if there is no constraint and then assigns 0 prior probability to models that do not include Agreeableness. If the prior model probabilities of the constrained model are normalized we get:

tb <- getPriorProbsPerModel(fit1)
tb[, 2] <- tb[, 2] / sum(tb[, 2])
tb
                                    model priorProb
1                           Agreeableness 0.1666667
3                Openness + Agreeableness 0.1666667
4            Extraversion + Agreeableness 0.1666667
2 Extraversion + Openness + Agreeableness 0.5000000 # <- favors the most complex model

However, this may lead to biased inference as there is a strong prior preference for the most complex model!

Expected behavior

I'd expect these prior model probabilties:

tb2 <- getPriorProbsPerModel(fit1)
tb2[, 2] <- (extraDistr::dbbinom(0:2, 2, 1, 1) / c(choose(2, 0:2)))[c(1, 2, 2, 1)]
tb2
                                    model priorProb
1                           Agreeableness 0.333
3                Openness + Agreeableness 0.167
4            Extraversion + Agreeableness 0.167
2 Extraversion + Openness + Agreeableness 0.333

Effectively, because one predictor included in all models, I would expect that the prior model probabilities are computed as if the model space contained one predictor less.

This example shows what I would expect when always including one predictor but naturally this generalizes to always including l out of k predictors.

Desktop:

If anything is unclear, please let me know!

merliseclyde commented 5 years ago

Merged in BAS version 1.5.4