I'm not sure, but I would say the p-value refers to a different effect size. Or better: to a different test statistic. Since some CIs are bootstrapped, the parametric test statistic no longer corresponds to them.
Taking your summary, the values for partial omega-squared and the p-values seem to be OK:

```r
# partial omega-squared from the ANOVA-table values
(2 * (9.637 - 1.346)) / (2 * 9.637 + 240 * 1.346)

# p-value from the (central) F-distribution
pf(7.160, df1 = 2, df2 = 229, lower.tail = FALSE)
```
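Written out as a general formula (just my reading of the computation above, with df1 = 2, MS_effect = 9.637, MS_error = 1.346 and N = 242):

```r
# partial omega-squared from ANOVA-table quantities:
# df1 * (MS_effect - MS_error) / (df1 * MS_effect + (N - df1) * MS_error)
partial_omega_sq <- function(df1, ms_effect, ms_error, n) {
  (df1 * (ms_effect - ms_error)) / (df1 * ms_effect + (n - df1) * ms_error)
}

# identical to the manual computation above
partial_omega_sq(df1 = 2, ms_effect = 9.637, ms_error = 1.346, n = 242)
```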
It's just that the 95% quantiles of the bootstrapped values for omega-squared do indeed include zero. I'm not sure how you would resolve these discrepancies. The bootstrapped CIs are not equal-tailed around the estimate - maybe it's better to get the bootstrapped SE and then calculate CIs based on the normal distribution...
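To make that idea concrete, here is a minimal sketch (not code from either package) of bootstrapping partial omega-squared and building a normal-approximation CI from the bootstrapped SE. `warpbreaks` is just a toy dataset, and the `Omega_Sq_partial` column name is taken from the effectsize output further down, so it may differ across versions:

```r
library(boot)
library(effectsize)

# statistic for boot(): partial omega-squared for each term of a toy ANOVA
boot_os <- function(data, i) {
  fit <- stats::aov(breaks ~ wool * tension, data = data[i, ])
  effectsize::omega_squared(fit, partial = TRUE)$Omega_Sq_partial
}

set.seed(123)
os_boot <- boot::boot(data = warpbreaks, statistic = boot_os, R = 1000)

estimate <- os_boot$t0                 # estimates on the original data
se <- apply(os_boot$t, 2, stats::sd)   # bootstrapped SE per term

# equal-tailed normal-approximation CI: estimate +/- z * SE
data.frame(
  term  = c("wool", "tension", "wool:tension"),
  lower = estimate - stats::qnorm(0.975) * se,
  upper = estimate + stats::qnorm(0.975) * se
)
```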
Yeah, I think it will be better to calculate CIs based on a normal distribution, or at least have an option to choose the sampling distribution. All effect size confidence intervals in my package default to using the normal distribution (https://github.com/IndrajeetPatil/ggstatsplot/blob/master/R/helpers_effsize_ci.R), which means there is usually a good correspondence between traditional p-values and bootstrapped CIs. Of course, users still have the option to choose what kind of CIs they want using the `conf.type` argument.
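If it helps, with a `boot` object like `os_boot` from the sketch above, the interval type is just the `type` argument of `boot::boot.ci()`, which is roughly what a `conf.type`-style option would pass through to (my assumption, not a description of the actual internals):

```r
# different interval types for the first term (index = 1)
boot::boot.ci(os_boot, conf = 0.95, type = "norm", index = 1)  # normal approximation
boot::boot.ci(os_boot, conf = 0.95, type = "perc", index = 1)  # percentile
boot::boot.ci(os_boot, conf = 0.95, type = "bca",  index = 1)  # bias-corrected, accelerated
```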
Therefore, it's super-surprising to me that, for example, the term `mpaa` has a p-value of 0.0009 and yet its confidence interval still includes 0!
Taking the bootstrapped SE, or computing the SE from the CI, makes it even worse - the lower bound of the normal-distribution CI is much lower than the bootstrapped one (because the bootstrapped CIs are not equal-tailed).
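For reference, "computing the SE from the CI" would look something like this sketch; it assumes an equal-tailed normal interval, which is exactly the assumption the bootstrapped CIs violate:

```r
# back out an SE from an equal-tailed CI under a normality assumption
se_from_ci <- function(lower, upper, level = 0.95) {
  (upper - lower) / (2 * stats::qnorm(1 - (1 - level) / 2))
}

# e.g., the 90% CI for `genre` shown further down
se_from_ci(lower = -0.02, upper = 0.17, level = 0.90)
```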
I'm not sure it would make sense to re-calculate the p-value from omega-squared, because then you would have a significant term in your "normal" model and a non-significant term when you compute omega-squared.
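"Re-calculating the p-value" would presumably amount to a Wald-style test on the bootstrap quantities, along these lines (a sketch continuing from `os_boot` above, not an established recommendation):

```r
# Wald-style p-values from the bootstrapped SEs
z <- os_boot$t0 / apply(os_boot$t, 2, stats::sd)
2 * stats::pnorm(-abs(z))
```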
Maybe you have an idea? As I typically don't use ANOVA, I'm not sure how to proceed here, except for mentioning in the docs that CIs based on bootstrapping may indicate "non-significance" even when the model does not.
Sorry, I haven't been much help; I am out of my statistical depth here. I will try to do some digging into when and why there are discrepancies between bootstrapped CIs and traditional p-values.
For comparison, here is the output from `effectsize`:
```r
# for reproducibility
set.seed(123)

library(ggstatsplot)
library(effectsize)

# to speed up the calculation, let's use only 10% of the data
movies_10 <- dplyr::sample_frac(tbl = ggstatsplot::movies_long, size = 0.1)

# fit the two-way ANOVA as an `aov` object
stats.object <- stats::aov(formula = rating ~ mpaa * genre, data = movies_10)

effectsize::omega_squared(model = stats.object, partial = TRUE)
#> Parameter  | Omega_Sq_partial |         90% CI
#> ----------------------------------------------
#> mpaa       |            -0.01 | [-0.01,  0.01]
#> genre      |             0.10 | [-0.02,  0.17]
#> mpaa:genre |             0.00 | [-0.10,  0.00]
```
Several notes:

There is a bug in `sjstats::omega_sq()`. Even though the p-values for all terms in the following `anova` object are less than 0.05, the confidence intervals for partial omega-squared include 0.
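For completeness, the comparison described here can be reproduced with something like the following; the `ci.lvl` argument name is from the sjstats version current at the time and may have changed since:

```r
# term p-values from the ANOVA table vs. bootstrapped omega-squared CIs
summary(stats.object)
sjstats::omega_sq(stats.object, partial = TRUE, ci.lvl = 0.95)
```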