easystats / bayestestR

:ghost: Utilities for analyzing Bayesian models and posterior distributions
https://easystats.github.io/bayestestR/
GNU General Public License v3.0
571 stars 55 forks source link

Advanced BF for restricted models #177

Closed mattansb closed 5 years ago

mattansb commented 5 years ago

Interval Null tests

I think it might be possible to extend bayesfactor_savagedickey to test against an interval null (kind of a hybrid with ROPE?) according to Morey and Rouder (2011) by extending the order restrictions in bayesfactor_savagedickey (which currently allows the one-sided testing).

One sided tests are: BFr0 = BFr1 Γ— BF10 r = the side restriction

Null interval tests are: BFAN = BFA1 / BFN1 A = restrict to outside the interval; N = restrict to inside the interval

This would require users to specify the interval... Maybe change the hypothesis to null and allow to be of length 1 (for point null) or 2 for (interval null)? (All of this is independent of whether or not the test is one/two sided).

Interestingly, if the null interval is in the form of `[Inf,0], then BF = (pd)/(1-pd)...

Tests parameter restrictions

In essence, we can feed samples of several parameters together and test their joint restrictions. For example, I can have samples for 4 parameters (X1...X4) and we can tests a restricted model of (X1 > 0) & ( X2 > X3 > X4). This will give a BF of the restricted model against the full model (not the null!): BFr1.

I've already implemented something like this in BFEffects::restrict, but it is heavily dependent on purrr and dplyr, and only worked with BayesFactor objects.






If we agree this is something worth investing in, I think the null-interval would happen for at least v3, and the parameter restrictions would happen maybe around v10 πŸ˜…...


TODO:

strengejacke commented 5 years ago

Sounds good! I would suggest to open a new branch for this, because we have to re-submit bayestestR in a very few days, so I'm not sure if we can implement this in time for the very next update.

mattansb commented 5 years ago

Not to worry - I definitely won't start on this for the next 2 weeks or so. But opening a new branch sounds like an excellent idea, as I expect this to be a big shift in the BF functions / docs.

mattansb commented 5 years ago

As a stater...

library(bayestestR)
library(rstanarm)
fit_stan <- stan_glm(mpg ~ wt + cyl + am,
                     data = mtcars)
hyps <- c("am > 0 & cyl < 0",
          "cyl < 0",
          "wt - cyl > 0")

x <- bayesfactor_restrict(fit_stan, hypothesis = hyps)
x
#> Computation of Bayes factors: sampling priors, please wait...
#> # Bayes Factor (Order-Restriction)
#> 
#>        Hypothesis P(Prior) P(Posterior) Bayes Factor
#>  am > 0 & cyl < 0     0.25        0.534         2.16
#>           cyl < 0     0.50        1.000         2.01
#>      wt - cyl > 0     0.52        0.088         0.17
#> ---
#> Bayes factors for the restricted movel vs. the un-restricted model.

colnames(x)
#> [1] "Hypothesis"     "Prior_prob"     "Posterior_prob" "BF"

Created on 2019-06-21 by the reprex package (v0.3.0)

mattansb commented 5 years ago

It seems that for factors, order restrictions only give the desired results when using contr.bayes contrasts (from bfrms)...

mattansb commented 5 years ago
library(bayestestR)
prior <- data.frame(X = rnorm(4000))
posterior <- data.frame(X = rnorm(4000,.5,.4))

# point null
(x <- bayesfactor_savagedickey(posterior,prior, null = 0)) # null has replaced hypothesis
#> Loading required namespace: logspline
#> # Bayes Factor (Savage-Dickey density ratio)
#> 
#>  Parameter Bayes Factor
#>          X         0.83
#> ---
#> Evidence Against The Null: [ 0 ]

# interval null
(x <- bayesfactor_savagedickey(posterior,prior, null = c(-0.1, 0.1))) # results are very close for such a small interval
#> # Bayes Factor (Savage-Dickey density ratio)
#> 
#>  Parameter Bayes Factor
#>          X         0.82
#> ---
#> Evidence Against The Null: [ -0.1, 0.1 ]

plot(x) # this still needs work, but technically works
#> Registered S3 methods overwritten by 'ggplot2':
#>   method         from 
#>   [.quosures     rlang
#>   c.quosures     rlang
#>   print.quosures rlang


(x <- bayesfactor_restrict(posterior,prior, hypothesis = c("X > .1 | X < -.1","-.1 < X & X < .1")))
#> # Bayes Factor (Order-Restriction)
#> 
#>        Hypothesis P(Prior) P(Posterior) Bayes Factor
#>  X > .1 | X < -.1    0.912        0.903         0.99
#>  -.1 < X & X < .1    0.088        0.097          1.1
#> ---
#> Bayes factors for the restricted movel vs. the un-restricted model.

x$BF[1] / x$BF[2] # this is in theory the same as the interval null, but computed differently.
#> [1] 0.8980288
# nice to see that it (somewhat) converges (sanity check)

Created on 2019-06-22 by the reprex package (v0.3.0)

Technically the interval-null is a probability ratio, and not a savage dickey density ratio... But it feels like these belong together, as their function purpose are the same - compute BFs based on prior and posterior distributions (not marginal likelihoods)... but so does bayesfactor_restrict, and it does get a separate function because the "what" the is being tested is different (but as you see above, in some cases not too different)... Any ideas how to deal with all of this? Simply specify in the docs? Make a separate function for the interval test? Make an aliased function/s?

DominiqueMakowski commented 5 years ago

That looks great though!

Technically the interval-null is a probability ratio, and not a savage dickey density ratio... But it feels like these belong together, as their function purpose are the same

I tend to agree. I think it's a good idea to regroup functions based on their usage/aim/output similarity rather than technical differences. However, maybe we should rename/alias savagedickey to something more generic and less abstract... Like bayesfactor_direct, bayesfactor_simple, bayesfactor_priorposterior, bayesfactor_ratio...

but so does bayesfactor_restrict, and it does get a separate function because the "what" the is being tested is different

I am not sure, could you please re-clarify the difference between the BF SD for a null interval and the BF restricted for the posterior outside vs. inside this interval?

Additional questions/thoughts:

mattansb commented 5 years ago

I think what is in common to both the sdBF and the interval-null is that they test univariate prior and posterior distributions. (Does that give any inspiration for names?) They both also always give some BFalternative vs. null (or BF10 for short). (But note that if the a directional test via direction it gives BFdirection-restricted vs. null).

On the other hand, bayesfactor_restrict (probably bayesfactor_restricted is better) can be used to restrict multivariate prior and posterior distributions, and always gives a BFrestricted model vs. un-restricted model (or BFr1 for short). For example, the point-null says A=B=C, and the alternative (unrestricted) says A≠B≠C, but we can also test an order restricted model of A>B>C, or A>B & C>B, etc... So that BFr1 tells us by how much more are the data probable under the restricted model vs. the unrestricted model. Richard Morey has a really good blog post explaining all of this.

Would it make sense to add a mix between BF and ROPE (a BF based on the ROPE interval) to our comparison?

In essence, the interval-null is exactly that! In practice, someone could pass the rope-range as the null argument!

How are both types related?

Due to the transitive property of the BF, we can get from one to the other in some cases, like in the example I gave above. Let's name some models:

Thus,

Due to the transitive property of the BF, BF32 = BF31 / BF21!

(This is also exactly the point Morey and Rouder (2011) make in their paper... Our interval-null test is what they call there the "Nonoverlapping Hypotheses").

strengejacke commented 5 years ago

This looks really impressive! We should think about how we can keep the arguments as consistent as possible, so users can almost intuively switch between the different BF-methods.

Quick comments:

1) Yes, bayesfactor_restricted() sounds better 2) Do we really need hypothesis = c("X > .1 | X < -.1","-.1 < X & X < .1"), or can we simply call that argument rope, allowing a numeric vector of length 2? By that, we have automatically define the inside/outside-range. 3) If we need 2), but rope would also be possible, maybe we can have both, so users can't choose what they prefer? 4) We should probably add some details to the BF-vignettes, which explains the differences (both technically like ratio of prior/post. dist, not marg. lik. and practically, i.e. when to use which BF) 5) We need to modify the plotting-functions in see, or at least add another plot-method

mattansb commented 5 years ago

@strengejacke we commented at the same time, and I think I answered some of your questions there, but just so I make sure all you points have been addressed:

  1. Agree.
  2. hypothesis = c("X > .1 | X < -.1","-.1 < X & X < .1") was only one example I used to show how the restricted BF was similar to the interval-null BF. Normally this will not be the usage of the bayesfactor_restricted, it is less suitable for testing interval restrictions, and is more suitable for testing order restrictions (like what I gave in the first example).
  3. The "rope BF" is the bayesfactor_savagedickey with the interval-null.
  4. Absolutely. Once I get all the functions right I will update the vignette with all of this info.
  5. Same as 4.
strengejacke commented 5 years ago

I'm not sure how much more we can add to such a plot, except maybe the shaded area for the null?

image

image

mattansb commented 5 years ago

I thought adding an alpha aesthetic (but the shading works and is more simple!) and removing the dots.

-- Mattan S. Ben-Shachar, PhD student Department of Psychology & Zlotowski Center for Neuroscience Ben-Gurion University of the Negev The Developmental ERP Lab

On Sun, Jun 23, 2019, 11:04 Daniel notifications@github.com wrote:

I'm not sure how much more we can add to such a plot, except maybe the shaded area for the null?

[image: image] https://user-images.githubusercontent.com/26301769/59973292-45354080-959e-11e9-9f9d-2943d6286701.png

[image: image] https://user-images.githubusercontent.com/26301769/59973295-4d8d7b80-959e-11e9-84f0-7678fa1500e9.png

β€” You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/easystats/bayestestR/issues/177?email_source=notifications&email_token=AINRP6G5KPYINENLGN663E3P34VCHA5CNFSM4HY5U7B2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYKZAHI#issuecomment-504729629, or mute the thread https://github.com/notifications/unsubscribe-auth/AINRP6DZTV2TXPUWMCQRXHTP34VCHANCNFSM4HY5U7BQ .

DominiqueMakowski commented 5 years ago

removing the dots.

:+1:

would it make sense to add the rope range as a default as null interval?

mattansb commented 5 years ago

would it make sense to add the rope range as a default as null interval?

It would if the interval-null BF is it's own function - right now the default in the bayesfactor_savagedickey is the point null...

DominiqueMakowski commented 5 years ago

It would if the interval-null BF is it's own function - right now the default in the bayesfactor_savagedickey is the point null...

right in that case let's keep it as is but maybe let put 'rope' as argument to hypohesis in which cases it will automatically run rope range?

mattansb commented 5 years ago

How about a general function for univariate testing (bayesfactor_simple \ bayesfactor_univariate \ bayesfactor_null (?)), where:

Any other name suggestions are welcome ^_^

strengejacke commented 5 years ago

I'm not sure I get the multivariate thing for the restricted BF. Do you have an example, including specifying the hypothesis, for a model with multiple variables?

mattansb commented 5 years ago

Sure. Here is the example from Richard Morey's blog post:

Say you have 3 experimental condition, and you want a BF to see if they affect some DV. So you build two models and compare them:

library(rstanarm)
library(bayestestR)
library(emmeans)

options(contrasts=c('contr.bayes', 'contr.bayes')) # from `bfrms` package

# replicating http://bayesfactor.blogspot.com/2015/01/multiple-comparisons-with-bayesfactor-2.html
disgust_data <- read.table(url('http://www.learnbayes.org/disgust_example.txt'),header=TRUE)

fit0 <- stan_glm(score ~ 1, data = disgust_data, family = gaussian(),
                 diagnostic_file = file.path(tempdir(), "df0.csv"),
                 prior = cauchy(0,0.5),
                 prior_aux = exponential(.1))
fit1 <- stan_glm(score ~ condition, data = disgust_data, family = gaussian(),
                 diagnostic_file = file.path(tempdir(), "df1.csv"),
                 prior = cauchy(0,0.5),
                 prior_aux = exponential(.1))
bayesfactor_models(fit0,fit1)
#> Computation of Bayes factors: estimating marginal likelihood, please wait...
#> Bayes factor analysis
#> ---------------------                     
#> [2] condition    0.26
#> 
#> Against denominator:
#>       [1] (Intercept only)   
#> ---
#> Bayes factor type:  marginal likelihoods (bridgesampling) 

We get a BF of 3.9 in favor of the null. But if you have some more specific hypothesis, such as that the conditions should be in some kind of order, and not just "different" from each other, the alternative tested above does not test that!

With order restrictions we can test several order restrictions at once (instead of testing A > B and then separately testing B > C - here we can test A > B > C in one BF!)

em_condition <- emmeans(fit1, ~ condition)
hyps <- c("lemon < control & control < sulfur") #this is one hypothesis made of several order restrictions.

bayesfactor_restrict(em_condition, prior = fit1, hypothesis = hyps)
#> Computation of Bayes factors: sampling priors, please wait...
#> # Bayes Factor (Order-Restriction)
#> 
#>                          Hypothesis P(Prior) P(Posterior) Bayes Factor
#>  lemon < control & control < sulfur     0.19         0.74         3.99
#> ---
#> Bayes factors for the restricted movel vs. the un-restricted model.

Note that this BF is for the restricted model against the unrestricted model: BFr1, meaning the data are 3.99 more probable under a model with the order restriction than with the model without any restriction! If we want BFr0, we can get it with BFr1 BF10 = 3.99 0.26 = 1.03. Thus the data is equally probable under a model with no effect at all for condition or a model with a specific order restriction.

In other words, if you have a hypothesis say that the effect of X is positive, and that the difference between A1 and A2 is negative, you can get a single BF representing a single model that has these (order) restrictions...

DominiqueMakowski commented 5 years ago

naive question: is it strongly different from a "regular" bf against the bull based on the marginal contrasts (e. g. obtained via emmeans)?

mattansb commented 5 years ago

On the marginal means, you would get 2 BFs with different alt and nulls:

I am not aware of any way to incorporate these two BFs into 1...

strengejacke commented 5 years ago

instead of testing A > B and then separately testing B > C - here we can test A > B > C in one BF!

I think this is clear. So this is again something slightly different than testing if an effects is outside a ROPE, say A < -.5 & A > .5?

Thus the data is equally probable under a model with no effect at all for condition or a model with a specific order restriction.

I don't quite get this. Isn't a model without any restriction the same as no effect at all for condition, so here we have either BF = 3.99 or BF= 1 / 3.99 ?

mattansb commented 5 years ago

A model with no effect is a model with a restriction, the restriction of: theta = 0!

An unrestricted model is the full model, where the parameters are free to be any value. In such an unrestricted model, we can have any of the following orders (though some are more probable than other a posteriori, they are all equally probable a priori): A > B > C, A > C > B, B > A >C, B > C > A, C > A > B, C > B > A.

bayesfactor_restricted computes the marginal likelihood of an order-restricted model, in which only some orders of the parameters are "allowed" a priori, and compares it the marginal likelihood of the unrestricted model. So you get BFr1 and not BFr0 (which you can get by: BFr0 = BFr1 / BF10, but that will over complicate the function, and we have a function the does BF10 - bayesfactor_models).

In theory you can also test interval restrictions with bayesfactor_restricted, but because of the way bayesfactor_restricted works, and the fact that it returns BFr1, it is better to use the interval-null option in the sdBF function (where these interval restrictions are testing with a logspline probability function, while bayesfactor_restricted just "counts" how many samples from the prior and posterior draws meet the order restriction).

Summary - model types

BFs types (and the model types they correspond to)

I hope that's more clear? This is definitely the hardest Bayes factor to grasp... I had to read the blog post and both papers several times, and I only really got the gist when I started writing BFEffects::restrict some time ago...

strengejacke commented 5 years ago

I thought adding an alpha aesthetic

Error in f(...) : Aesthetics can not vary with a ribbon
mattansb commented 5 years ago

That settles that then πŸ˜…

strengejacke commented 5 years ago

I think what we can add to the docs is a one-sentence summary for all BF methods we have. Correct me if I'm wrong, but what I understood is:

I think that it's very helpful for users to decide which BF is the most approprioate for their planned analyses or the question they wish to answer.

One question that remains is probably: Why do I need inclusion BF, e.g. to test if it makes sense to include an interaction term, if I can test a model with against a model w/o that interaction term?

mattansb commented 5 years ago

You would want this in cases where you're not really interested in a specific model, but you are interested in seeing how effect X does across several models. Personally I don't really use this, but it is a popular feature of JASP, so...

strengejacke commented 5 years ago

Does it make sense to have a plot-method for BF restricted? I'm currently not sure how this could look like.

mattansb commented 5 years ago

Don't think so... at best it can be passed to the BF-models' pizza plot? Meh..

DominiqueMakowski commented 5 years ago

Great work πŸ₯‡

I fear like the challenge now will become to show the users how/when these different versions can be relevant for their questions. Which means robust documentation and in the long run, posts demonstrating some use cases. But I don't worry about it at all, it'll come with time. As we ourselves become familiar with these tools/indices and we start using them for our questions, it will help us deliver interesting and useful examples ☺️

Don't think so... at best it can be passed to the BF-models' pizza plot? Meh..

πŸ•πŸ‘

mattansb commented 5 years ago

Is it possible to add BF-restricted to the general bayesfactor function...

I think so. I've added this to my to-do list (above).

The various docs will need to be heavily expanded indeed... I think it is best to let this PR cook slowly, until the functions are well documented (also with proper names), with explanations and examples (which are a major feature of our β€πŸ“¦).

Generally I don't think we're in a hurry with these two BFs - the major BF functions have already been implemented (well), are well documented, and these new BFs are very (very) niche.

strengejacke commented 5 years ago

True! but...

and these new BFs are very (very) niche.

... and we can be one of the firsts to implement it! :-)

strengejacke commented 5 years ago

Over the course of the past weeks, I started liking bayes factors. One reason is for sure the nice implementation in bayestestR and see, but - apart from critism especially from Bayesian statisticians - I think these are reasonable additions to the classical NHST in frequentist frameworks. As far as I am concerned, I'm not completely devoted to either the Bayesian or Frequentist side of statistics, so having all these tools (also in our other packages) is really great!

DominiqueMakowski commented 5 years ago

I think it is best to let this PR cook slowly

Beware, Daniel's blood will be boiling seeing this unmerged PR hanging in there πŸ˜…

Over the course of the past weeks, I started liking bayes factors.

a big hats off to @mattansb for successfully shifting @strengejacke's posteriors :)

But, I tend to agree :) however, I am still very curious regarding these impressive precepts in terms of sampling (e.g., "at least 10000 draws to have accurate BFs"). I am still wondering whether the accuracy for regular models (e.g. rstanarm's 4000 default) is this much off or is it, in the end, quite usable...

strengejacke commented 5 years ago

I am still wondering whether the accuracy for regular models (e.g. rstanarm's 4000 default) is this much off or is it, in the end, quite usable...

Looks like another paper... There's much content we can generate from this package ;-)

mattansb commented 5 years ago

... and we can be one of the firsts to implement it! :-)

Actually... the order restrictions have recently been implemented in JASP (but I didn't know this when I started working on our function), and the null interval can, with some work, be implemented in brms... but obviously ours is better!

Over the course of the past weeks, I started liking bayes factors.

My work here is done! 😜

"at least 10000 draws to have accurate BFs"

Nope, this was a mistake on my part - 4000 seems to be fine for normal usage. Lets say you need 200 draws per order restriction (for a 0.5% precision) (one order restriction: A >B, two: A > B & C > D), that mean you could have up to 200 (orthogonal!) order restrictions, which is likely to never ever happen (BF = 7*10^42). I will amend this in the doc ^_^

mattansb commented 5 years ago

I think after the functions here as solid, I will go one to .. basically re-write (or at least re-structure) the BF vignette into the following structure (putting this here for some feedback & so I don't forget):

  1. General Intro to BFs (framing BFs as any change from prior odds to posterior odds, not just as likelihood-ratios).
  2. Univariate BFs 2.1. Null-Interval BFs (+ "one sided tests") 2.2. sdBF - Show how it is equal to the null-interval as the interval shrinks to a point.
  3. Comparing models 3.1. marginal likelihood (=evidence) and more (BIC approx.) 3.2. model averaging (inclusion BFs) 3.3. order restricted models

(Still holding out for function names. bayesfactor_univariate? bayesfactor_parameter [vs. bayesfactor_models]?)

mattansb commented 5 years ago

I think after the functions here as solid, I will go one to .. basically re-write (or at least re-structure) the BF vignette into the following structure (putting this here for some feedback & so I don't forget):

  1. General Intro to BFs (framing BFs as any change from prior odds to posterior odds, not just as likelihood-ratios).
  2. Univariate BFs 2.1. Null-Interval BFs (+ "one sided tests") 2.2. sdBF - Show how it is equal to the null-interval as the interval shrinks to a point.
  3. Comparing models 3.1. marginal likelihood (=evidence) and more (BIC approx.) 3.2. model averaging (inclusion BFs) 3.3. order restricted models

(Still holding out for function names. bayesfactor_univariate? bayesfactor_parameter [vs. bayesfactor_models]?)

strengejacke commented 5 years ago

Sounds good! :-)

DominiqueMakowski commented 5 years ago

bayesfactor_parameters vs. bayesfactor_models is nice and simple!

Although bayesfactor inclusion are also about parameters (giving the "average" BF for each parameter), and yet the way it works is closer to bayesfactor_models...

mattansb commented 5 years ago

BF inclusion tests terms - so a factor will get a single inclusion BF even if has, in the model, 10 parameters. I will make sure this is clear in the (updated) vignette.

Ok, so we'll go with bayesfactor_parameters:

mattansb commented 5 years ago

I'm closing this as all major points have been addressed. Finer points can be discussed at #194