Closed IndrajeetPatil closed 6 years ago
By the way this exposes a philosophical difference. I have a disdain for all the partials be it R, eta or omega the interpretation is much less clean and too many people use them as justification for strengths they are hoping to justify.
Now that the bug is identified and logged after you clean up the set seed issue I'm happy to write a test that avoids the path that leads to the bug.
Cool, will let you know when this issue is resolved on sjstats
front and then you can add the remaining tests.
Choice to default to partials was based on this review and the recommendations therein: https://www.frontiersin.org/articles/10.3389/fpsyg.2013.00863/full
Right I get why the partials have to be used for power and meta across studies but that same experimental design is a real show stopper. It's okay since you offer both. Just giving you insight as to why I quickly found that bug.
Dani does a great job or arguing the counterpoint here in her book (Learning Statistics with R) starting on page 513 (real page not pdf page).
I'm not sure how to address this issue, except excluding anova
-objects. The call to anova()
returns a data frame, not a classical model object.
data(mtcars)
m <- anova(lm(mpg ~ hp * wt, mtcars))
formula(m)
#> Df ~ `Sum Sq` + `Mean Sq` + `F value` + `Pr(>F)`
str(m)
#> Classes 'anova' and 'data.frame': 4 obs. of 5 variables:
#> $ Df : int 1 1 1 28
#> $ Sum Sq : num 678.4 252.6 65.3 129.8
#> $ Mean Sq: num 678.37 252.63 65.29 4.63
#> $ F value: num 146.4 54.5 14.1 NA
#> $ Pr(>F) : num 1.23e-12 4.86e-08 8.11e-04 NA
#> - attr(*, "heading")= chr "Analysis of Variance Table\n" "Response: mpg"
Created on 2018-10-05 by the reprex package (v0.2.1)
The problem is that I need the formula for boot strapping...
I tend to agree that dropping it as an input object type makes sense. I always think of anova() as just a nice little method for producing "pretty" output (classic ANOVA table) from the upstream methods like aov() and lm() that do the real work.
@strengejacke I am a bit confused about why this works with sjstats::omega_sq
then. Does CI computation for omega-squared not need the formula for bootstrapping?
sjstats::omega_sq(
model = anova(lm(mpg ~ hp * wt, mtcars)),
partial = FALSE,
ci.lvl = 0.95 ,
n = 100
)
#> term omegasq conf.low conf.high
#> 1 hp 0.596 0.686 0.886
#> 2 wt 0.219 0.398 0.761
#> 3 hp:wt 0.054 0.069 0.522
Also, to be backward compatible, I don't think it would be a good idea to remove support for anova
objects because even with these objects the function fails in a very specific context. One possibility is that when the entered object is anova
&& partial = FALSE
&& is.null(ci.lvl)
is FALSE
, then the columns conf.low
and conf.high
in the output can be filled with NA
s and produce a warning
. What do you think?
omega_sq()
uses bootstrapping if partial = TRUE
, and then again fails for anova
-objects. I would only give a warning for those cases where anova
does not work, not completely remove the support.
Just to be clear, you'll need two conditionals since it fails with
eta_sq
and partial = FALSE
+ omega_sq
and partial = TRUE
# Constant model, ci.lvl, and n
# Eta first -- varying only the partial parameter
sjstats::eta_sq(
model = anova(aov(mpg ~ hp * wt, mtcars)),
partial = TRUE,
ci.lvl = 0.95 ,
n = 100
)
#> term partial.etasq conf.low conf.high
#> 1 hp 0.839 0.699 0.892
#> 2 wt 0.661 0.414 0.773
#> 3 hp:wt 0.335 0.073 0.539
sjstats::eta_sq(
model = anova(aov(mpg ~ hp * wt, mtcars)),
partial = FALSE,
ci.lvl = 0.95 ,
n = 100
)
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Error in mutate_impl(.data, dots): Evaluation error: Result 2 is not a length 1 atomic vector.
# Omega next -- varying only the partial parameter
sjstats::omega_sq(
model = anova(aov(mpg ~ hp * wt, mtcars)),
partial = TRUE,
ci.lvl = 0.95 ,
n = 100
)
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Error in mutate_impl(.data, dots): Evaluation error: Result 2 is not a length 1 atomic vector.
sjstats::omega_sq(
model = anova(aov(mpg ~ hp * wt, mtcars)),
partial = FALSE,
ci.lvl = 0.95 ,
n = 100
)
#> term omegasq conf.low conf.high
#> 1 hp 0.596 0.686 0.886
#> 2 wt 0.219 0.398 0.761
#> 3 hp:wt 0.054 0.069 0.522
Created on 2018-10-05 by the reprex package (v0.2.1)
Yes, I just need to check in the bootstrap-function that the object has a formula and is no data frame. Else, computing CI via bootstrap fails.
Thank you. By the way this reminds me of another pet peeve of mine statistically. This is an unbalanced design so there are at least 3 (actually 4) ways of computing sums of squares. By choosing to use default aov
or lm
without adjusting the contrasts you're actually generating the results that are least likely to be of interest to the researcher (type 1).
This article with references does a great job of laying it out:
References [1] John Fox. “Applied Regression Analysis and Generlized Linear Models”, 2nd ed., Sage, 2008.
[2] David G. Herr. “On the History of ANOVA in Unbalanced, Factorial Designs: The First 30 Years”, The American Statistician, Vol. 40, No. 4, pp. 265-270, 1986.
[3] Oyvind Langsrud. “ANOVA for unbalanced data: Use Type II instead of Type III sums of squares”, Statistics and Computing, Volume 13, Number 2, pp. 163-167, 2003.
[4] Ista Zahn. “Working with unbalanced cell sizes in multiple regression with categorical predictors”, 2009. prometheus.scp.rochester.edu/zlab/sites/default/files/InteractionsAndTypesOfSS.pdf
More specifically to R using the car::Anova() with type 2 might be a better default if the design is unbalanced.
To get a better sense of just how widely and wildly the results might vary let me should you for just eta squared and partial eta squared using the lsr
package...
# install.packages(lsr)
lsr::etaSquared(aov(mpg ~ hp * wt, mtcars), type = 1)
#> eta.sq eta.sq.part
#> hp 0.60243734 0.8394308
#> wt 0.22434811 0.6606549
#> hp:wt 0.05797826 0.3347193
# order matters for type 1!!!!
lsr::etaSquared(aov(mpg ~ wt * hp, mtcars), type = 1)
#> eta.sq eta.sq.part
#> wt 0.75283279 0.8672499
#> hp 0.07395266 0.3908931
#> wt:hp 0.05797826 0.3347193
# recommended is type 2 or 3
lsr::etaSquared(aov(mpg ~ hp * wt, mtcars), type = 2)
#> eta.sq eta.sq.part
#> hp 0.07395266 0.3908931
#> wt 0.22434811 0.6606549
#> hp:wt 0.05797826 0.3347193
lsr::etaSquared(aov(mpg ~ wt * hp, mtcars), type = 2)
#> eta.sq eta.sq.part
#> wt 0.22434811 0.6606549
#> hp 0.07395266 0.3908931
#> wt:hp 0.05797826 0.3347193
lsr::etaSquared(aov(mpg ~ hp * wt, mtcars), type = 3)
#> eta.sq eta.sq.part
#> hp 0.09731877 0.4578520
#> wt 0.17234960 0.5992978
#> hp:wt 0.05797826 0.3347193
lsr::etaSquared(aov(mpg ~ wt * hp, mtcars), type = 3)
#> eta.sq eta.sq.part
#> wt 0.17234960 0.5992978
#> hp 0.09731877 0.4578520
#> wt:hp 0.05797826 0.3347193
Created on 2018-10-05 by the reprex package (v0.2.1)
Thanks for listening. Chuck
Hmm, good point. I was thinking that this can be avoided on the user side by using car::Anova()
model object, but even that suffers from the same issue raised in this thread.
library(car)
#> Loading required package: carData
# creating a model
mod <- stats::lm(formula = mpg ~ hp * wt,
data = mtcars)
sjstats::eta_sq(car::Anova(mod, type = "II"),
partial = FALSE,
ci.lvl = 0.95)
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Error in mutate_impl(.data, dots): Evaluation error: Result 2 is not a length 1 atomic vector.
sjstats::eta_sq(car::Anova(mod, type = "III"),
partial = FALSE,
ci.lvl = 0.95)
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Warning in anova.lm(model): ANOVA F-tests on an essentially perfect fit are
#> unreliable
#> Error in mutate_impl(.data, dots): Evaluation error: Result 2 is not a length 1 atomic vector.
Created on 2018-10-05 by the reprex package (v0.2.1)
This issue is not observed for
sjstats::omega_sq
or whenpartial = TRUE
forsjstats::eta_sq
.Created on 2018-10-04 by the reprex package (v0.2.1)