easystats / effectsize

:dragon: Compute and work with indices of effect size and standardized parameters
https://easystats.github.io/effectsize/
Other
338 stars 24 forks source link

Effect sizes in Bayesian regressions #8

Closed DominiqueMakowski closed 4 years ago

DominiqueMakowski commented 5 years ago

Opening a new thread as we strayed from the topic of #7

this correspondence between test statistics and indices of effect size is quite convenient in some cases... What would be the alternative for Bayesian models??

Unfortunately, there isn't any... It seems like any effect sized in a Bayes framework actually needs to be computed from the posteriors (even when it's simple, it's hard... good luck...). Generally, there seems to be very little work on standardized effect sizes in Bayesian modeling. This is probably because up until recently most users of Bayes were high-level stats / modeling users who ~felt they were above the dumb social sci needs~ were not interested in standardized anything.

But I read some tweet (or maybe on the JASP forum?) that the JASP team is working on some partial eta square / beta stuff...

EDIT: found it: https://onlinelibrary.wiley.com/doi/full/10.1111/stan.12173

Originally posted by @mattansb in https://github.com/easystats/effectsize/issues/7#issuecomment-541341954

DominiqueMakowski commented 5 years ago

Marsman's paper seems very interesting (unfortunately I didn't manage to through the wall of equations 😞)

If I understand, the only existing alternative for Bayesian regressions (currently in the work in this package) is a smart standardization of the posterior that would approximate (partial) correlations or some standardized differences?

DominiqueMakowski commented 5 years ago

@mattansb I've implemented some approximation to test stat for Bayesian models. Not sure if it makes a lot of sense from a theoretical perspective but seems to work from an applied one 😅

https://easystats.github.io/effectsize/articles/bayesian_models.html

The idea is to get t from coef and posterior SD, find the frequentist DoF (😱) and get the r posterior... Seems almost to easy

mattansb commented 5 years ago

I it's time for a dev branch so we can discuss commits...

DominiqueMakowski commented 5 years ago

👍

DominiqueMakowski commented 5 years ago

dev is up

mattansb commented 5 years ago

Okay, so I'll split my comment into de facto and de jure:

de facto

You are doing:

se <- sd(posterior)
t <- posterior / se
df <- from_freq_magic(...)

convert_t_to_r(t, df)

This will probably almost always give results very close to the true results....

Except...

How does this function deal with multiple dfs (I can't seem to re-build the package to check)? Like in mixed models? If it cannot return the approx-correct df, the calculation will be not be a good approx...

de jure

The idea of degrees of freedom in a Bayesian framework is almost heresy! That is - df represent the ~"number of bits of data free to change that can still result in this estimate". But in Bayes the estimates aren't based only on the data - but also on the prior (and thus also on previous data, and so forth). (Also there isn't 1 estimate, but a whole distribution.)

So in the most technical sense, this is wrong 😅

But see de facto above 😉

DominiqueMakowski commented 5 years ago

How does this function deal with multiple dfs

Here lies the problem. Currently, I added this function in parameters, and for mixed models the most straigthforward and quick method is to 1) refit the model using lme4 (see here) and 2) use the kenward-roger approximations (same as for p values for lme4). It's clearly not the best way to tackle this.

The idea of degrees of freedom in a Bayesian framework is almost heresy!

Haha I'd bet this hybrid approach would trigger all the bayestremists 😅 As a pragmaticist I'd tend to say "as long as it serves better science" but it's true that best would be to also be theoretically correct... But as we said, is there currently any better alternatives out there?

Another thing I am wondering is for non-linear models, e.g., logistic regression. The freq models usually report a z value. But as we only have access to the sample SD for the posteriors (same as for linear model), does it make sense to still compute t and then t to r or should we use z values?

mattansb commented 5 years ago

Another thing I am wondering is for non-linear models

For these models there's not need for all this "black magic" because they have OR / log odds / rate etc. as the model parameters! Actually, this problem of effect sizes is only a problem in linear models (because you need to estimate the variance explained and the error variance).

is there currently any better alternatives out there?

posterior_predict used like this can give some form of variance explained, but it's not really partialized (so it's interpretation is not straightforward...)

DominiqueMakowski commented 5 years ago

For these models there's not need for all this "black magic" because they have OR / log odds / rate etc. as the model parameters!

Forgot about that haha

posterior_predict used like this can give some form of variance explained

Interesting... but not quite there yet 😞

DominiqueMakowski commented 5 years ago
library(parameters)
library(effectsize)
#> 
#> Attaching package: 'effectsize'
#> The following objects are masked from 'package:parameters':
#> 
#>     cohens_f, epsilon_squared, eta_squared, omega_squared

model <- glm(vs ~ cyl + disp + drat, data = mtcars, family = "binomial") 

parameters <- model_parameters(model)
#> Waiting for profiling to be done...
parameters$r <- convert_z_to_r(parameters$z, n = insight::n_obs(model))
parameters$r_from_odds <- convert_odds_to_r(parameters$Coefficient, log = TRUE)
parameters
#> Parameter   | Coefficient |    SE |         95% CI |     z | df |     p |     r | r_from_odds
#> ---------------------------------------------------------------------------------------------
#> (Intercept) |       25.09 | 11.88 | [ 6.43, 56.94] |  2.11 | 28 | < .05 |  0.35 |        0.99
#> cyl         |       -1.99 |  1.05 | [-4.57, -0.13] | -1.89 | 28 | 0.06  | -0.32 |       -0.48
#> disp        |       -0.01 |  0.02 | [-0.06,  0.02] | -0.47 | 28 | > .1  | -0.08 |        0.00
#> drat        |       -3.18 |  2.16 | [-8.70,  0.49] | -1.47 | 28 | > .1  | -0.25 |       -0.66

Created on 2019-10-13 by the reprex package (v0.3.0)

The two "r" do not match tho... any idea why?

mattansb commented 5 years ago

I think it's because in a multiple regression setting, z and log odds don't directly correspond.

In any case r doesn't make any sense in a non-Gaussian model. All of these "just because you can doesn't mean it makes sense" uses should be documented somewhere... Maybe drop convert_odds_to_r altogether? What users would probably want is to standardize coeffs here...

Also convert_F_to_r usually won't make sense - if df > 1 it would be multiple R, usually reported as R2, and then it's just partial eta squared.

Also also... The file names and contents for all these functions needs to be make concise - it took me way too long to find convert_z_to_r and convert_odds_to_r!

Also also also, @DominiqueMakowski this thread is about Bayes!

DominiqueMakowski commented 5 years ago

z and log odds don't directly correspond

That's strange... Which one gives the most "accurate" r/d? For instance in a metaanalysis if you want to convert everything to d, it's kinda problematic if two approaches give quite different values...

In any case r doesn't make any sense in a non-Gaussian model.

If you want to interpret it as variance explained sure, but it makes sense if you just want to have all your effects on the same scales... and possibility + good documentation > impossibility (as we discussed several times ^^)

it would be multiple R, usually reported as R2, and then it's just partial eta squared.

Since eta squared is equivalent to R2, which is a more generic name ("Eta Squared, however, is used specifically in ANOVA models.", here), wouldn't it be clearer to rename all those as t_to_r2 etc. (it looks better too)?

The file names and contents for all these functions needs to be make concise

Agreed

this thread is about Bayes!

U started! 😁

mattansb commented 5 years ago

Which one gives the most "accurate" r/d?

For d, the odds to d looks like the correct one.

but it makes sense if you just want to have all your effects on the same scales...

It "makes sense" to want that, but it doesn't make sense to get that in this method - you should instead use parameters_standardize.

wouldn't it be clearer to rename all those as t_to_r2 etc

I think it's a great idea to have !

F_to_partial_R2 <- F_to_partial_eta_squared
t_to_partial_R2 <- t_to_partial_eta_squared 

Then the convert_t_to_r could be:

convert_t_to_partial_r <- function(t, df_error) sign(t) * sqrt(t_to_partial_R2(t, df_error))

(I think it is important to have the "partial" in the names or at the very least the doc title - users shouldn't think they're getting the simple correlation.)


Another thing I am wondering is for non-linear models, e.g., logistic regression.

You stated!

mattansb commented 5 years ago

Going back to:

I've implemented some approximation to test stat for Bayesian models. Not sure if it makes a lot of sense from a theoretical perspective but seems to work from an applied one 😅

I said that de facto...

This will probably almost always give results very close to the true results....

I'd like to amend that: It will work in cases where the prior is relatively lax (i.e., not strong); in these cases SDposteriorSEfrequentist (because the posterior ≈ the likelihood function). But when this equality will be broken with strong priors, where SDposterior< SEfrequentist, which will lead to miss-estimating the effect size.

strengejacke commented 5 years ago

posterior_predict used like this can give some form of variance explained

Which is implemented in performance::icc() :-)

mattansb commented 5 years ago

@strengejacke interesting.... Care to take a stand at that here for gaussian models? (This will give a non-partial R2, if I understand correctly?)

mattansb commented 4 years ago

@DominiqueMakowski I suggest moving the related functions and vignette to a separate branch, as they don't yet fully work / ongoing debate here about how to do this, and It is possible at this time to get regular beta (standardized coefficients) for Bayesian models.

DominiqueMakowski commented 4 years ago

Yes I agree, or rename it as something WIP

mattansb commented 4 years ago

@DominiqueMakowski What is left to do here?

Standardized regression coeffs are available... Are we talking about an Eta2-like effect size?

DominiqueMakowski commented 4 years ago

I am talking about a consistent and robust method to get a unified ("similar") index for different kinds of models (regressions with factors, interactions, and logistic models) 😬

mattansb commented 4 years ago

Okay, I've come up with an idea.

  1. Sample from the ppd.
  2. For each sample: 2.1. fit a linear model 2.2 estimate the effect size
  3. Use this posterior distribution of effect sizes to get HDI etc.
library(rstanarm)
library(effectsize)

eta_squared_posterior <- function(model, partial = FALSE, nsamps = 1000, verbose = TRUE, ...) {
  if (!insight::model_info(model)$is_linear) {
    stop("Only applicable to linear models")
  }

  if (partial) {
    warning("Only support non partial PVE.")
  }

  # get ppd
  ppd <- rstantools::posterior_predict(model)
  nsamps <- min(nsamps, nrow(ppd))
  i <- sample(nrow(ppd), size = nsamps)
  ppd <- ppd[i,]

  # get model data
  f <- insight::find_formula(model)$conditional
  X <- insight::get_predictors(model)
  resp_name <- insight::find_response(model)

  if (verbose) {
    message("Sampleing effect size... This can take a while...")
  }
  res <- apply(ppd, 1, function(r) {
    # sampled outcome + predictors
    temp_dat <- X
    temp_dat[[resp_name]] <- r

    # fit a simple linear model
    temp_fit <- lm(f, temp_dat)

    # compute effect size
    ANOVA <- car::Anova(temp_fit, type = 3)
    es <- eta_squared(ANOVA, ci = NA, partial = FALSE)

    data.frame(t(setNames(es$Eta_Sq, es$Parameter)), check.names = F)
  })

  res <- do.call("rbind", res)
  return(res)
}

model <- stan_lmer(mpg ~ wt + qsec * factor(am) + (1|cyl),
                   data = mtcars, refresh = 0)
pp_eta2 <- eta_squared_posterior(model)
#> Sampleing effect size... This can take a while...

bayestestR::describe_posterior(pp_eta2)
#> # Description of Posterior Distributions
#> 
#> Parameter       | Median |         89% CI | pd |        89% ROPE | % in ROPE
#> ----------------------------------------------------------------------------
#> wt              |  0.408 | [0.178, 0.642] |  1 | [-0.100, 0.100] |     0.000
#> qsec            |  0.105 | [0.000, 0.276] |  1 | [-0.100, 0.100] |    53.984
#> factor(am)      |  0.024 | [0.000, 0.100] |  1 | [-0.100, 0.100] |   100.000
#> qsec:factor(am) |  0.036 | [0.000, 0.133] |  1 | [-0.100, 0.100] |    92.368

## Compare to:

library(magrittr)
lm(mpg ~ wt + qsec * factor(am), data = mtcars) %>%
  car::Anova(type = 3) %>%
  eta_squared(partial = FALSE)
#> Parameter       | Eta2 |       90% CI
#> -------------------------------------
#> wt              | 0.44 | [0.21, 0.60]
#> qsec            | 0.08 | [0.00, 0.27]
#> factor(am)      | 0.05 | [0.00, 0.21]
#> qsec:factor(am) | 0.07 | [0.00, 0.24]

Created on 2020-05-25 by the reprex package (v0.3.0)

For now, this only works with non-partial eta squared, so it gives the total (unique) variance explained. For now, I only use a sub-sample of the ppd to speed it up.

you can find it here: https://github.com/easystats/effectsize/blob/dev/WIP/eta_squared_posterior.R

DominiqueMakowski commented 4 years ago

finally some good news on this front, that looks super promissive!

So let me rephrase to see if I understood correctly:

Let's say you have a model y ~ x with 10 obs.

  1. for each of the 10 obs of y, you get the posterior distribution (of for instance 200 samples) of the prediction
  2. for each of these 200 samples, you fit a linear model predicting the predicted values with the original data (?)
  3. You get 200 samples of for instance eta2 for each parameter that you consider as your posterior distribution of eta2
mattansb commented 4 years ago

Exactly!

I will have to think if there is any way to make it faster, as it is currently pretty slow (working across each sample...).

mattansb commented 4 years ago

A collection of stan threads that seem to support the logic of my method:

Using the PPD

(I also asked here if my method is applicable. Waiting for a response...)

Using the posterior parameters

emmawang123123 commented 4 years ago

Can i use bayestestR to test the regression results of the Bayesian quantile model ?

mattansb commented 4 years ago

Hi @emmawang123123 - I suggest opening a new issue (maybe on the bayestestR repo) with some more details (:

emmawang123123 commented 4 years ago

thank you,let me try

------------------ 原始邮件 ------------------ 发件人: "Mattan S. Ben-Shachar"<notifications@github.com>; 发送时间: 2020年5月30日(星期六) 凌晨1:18 收件人: "easystats/effectsize"<effectsize@noreply.github.com>; 抄送: "西丽西亚"<1635823199@qq.com>;"Mention"<mention@noreply.github.com>; 主题: Re: [easystats/effectsize] Effect sizes in Bayesian regressions (#8)

Hi @emmawang123123 - I suggest opening a new issue (maybe on the bayestestR repo) with some more details (:

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

mattansb commented 4 years ago

107