kassambara / rstatix

Pipe-friendly Framework for Basic Statistical Tests in R
https://rpkgs.datanovia.com/rstatix/

anova_test function gives partial eta squared despite setting effect.size = 'ges' #132

Open UhuEngelCrash opened 2 years ago

UhuEngelCrash commented 2 years ago

The anova_test function returns partial eta squared regardless of the value of the effect.size argument. Whether you pass "ges" or "both", partial eta squared is always output. Here is a discussion of this issue that I found on Stack Overflow: https://stackoverflow.com/questions/67907016/rstatix-package-anova-test-function-gives-partial-eta-squared-despite-setting-ef

c-hoffmann commented 1 year ago

Is there no fix for this yet? Or has this been fixed and not closed?

UhuEngelCrash commented 1 year ago

Unfortunately, the error has not yet been fixed. I am posting the Stack Overflow answer below. Perhaps the author will feel encouraged to correct the error:

Answer

rstatix::anova_test seems to contain a mistake in the calculation! I would be very, very careful with this function.

Note that eta_sq is deprecated, and effectsize::eta_squared should be used.

Proper calculation

We have three SS values: 1.412238, 72.752431, and 28.003665. We can calculate the pes and ges:

pes: 1.412238 / (1.412238 + 28.003665)

ges: 1.412238 / (1.412238 + 72.752431 + 28.003665)
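As a quick check of the arithmetic above, here is a minimal base-R sketch that evaluates both ratios from the three quoted SS values. The variable names are illustrative only and do not come from rstatix.

```r
# Sums of squares quoted in the answer above (variable names are illustrative).
ss_effect   <- 1.412238   # first SS value, the numerator of both ratios
ss_other    <- 72.752431  # second SS value, enters only the ges denominator
ss_residual <- 28.003665  # third SS value, the residual sum of squares

# Partial eta squared: effect SS over effect SS plus residual SS.
pes <- ss_effect / (ss_effect + ss_residual)

# Generalized eta squared as written above: effect SS over the sum of all three.
ges <- ss_effect / (ss_effect + ss_other + ss_residual)

print(round(c(pes = pes, ges = ges), 4))
```

With these inputs the two ratios come out at roughly 0.048 and 0.014, so under this calculation the ges is clearly smaller than the pes.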

anova_test

Under the hood, anova_test calls two functions for pes and ges calculation:

pes: rstatix:::add_partial_eta_squared

ges: rstatix:::add_generalized_eta_squared

The pes calculation by anova_test

res.anova.summary$ANOVA %>% mutate(pes = .data$SSn/(.data$SSn + .data$SSd))

This indeed calculates the pes as we expect it to.

The ges calculation by anova_test

aov.table <- res.anova.summary$ANOVA
aov.table %>% mutate(ges = .data$SSn/(.data$SSn + sum(unique(.data$SSd)) + obs.SSn1 - obs.SSn2))

Here, we run into a problem. This code seems blatantly incorrect. It just divides each sum of squares value by itself plus the residual sum of squares (28.004). That is the pes, not the ges.

You could contact the maintainer of the package (maintainer("rstatix")) or create a new issue for the rstatix package here.

c-hoffmann commented 1 year ago

Interestingly, I can replicate the example issue, but in my own calculations (which are repeated measures ANOVAs, though) I get different values for ges and pes. Not sure why.

mmejia23 commented 2 months ago

I'm reposting an answer I just posted on Stack Overflow: https://stackoverflow.com/questions/67907016/rstatix-package-anova-test-function-gives-partial-eta-squared-despite-setting-ef/78369716#78369716

This may not be an error.

As far as I understand, when there are only independent between-subject variables, and you don't specify that any of them are "observed" (or measured, i.e. non-manipulated), pes and ges are equal.

See the quote from Lakens (2013): "When all factors are manipulated between participants η²_G and η²_p are identical." https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2013.00863/full

If you want the calculation mentioned here: ges: 1.412238 / (1.412238 + 72.752431 + 28.003665)

you should specify "Species" as an observed (measured) variable:

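library(rstatix)  # provides anova_test(), get_anova_table(), and the %>% pipe

# Treat Species as an observed (measured) variable when computing ges
aov_ges <- iris %>% anova_test(Sepal.Length ~ Sepal.Width + Species,
                               detailed = TRUE,
                               effect.size = "ges", observed = "Species")
get_anova_table(aov_ges)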

ges without measured variable: 0.2811500

ges with Species as measured variable: 0.09804556
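To illustrate why declaring an observed variable changes the result, here is a rough base-R sketch of the ges formula quoted earlier in the thread. It rests on my own assumptions, which the thread does not confirm: that obs.SSn1 is the summed SSn of all observed effects, and that obs.SSn2 is an effect's own SSn when that effect is observed and zero otherwise. The helper ges_sketch and the table layout are hypothetical and are not part of rstatix. The inputs are the three SS values quoted in the Stack Overflow answer, so the numbers are not expected to match the anova_test output above, which may be based on a different type of sums of squares.

```r
# SS values quoted earlier in the thread; SSd is the residual SS for both effects.
aov_table <- data.frame(
  Effect = c("Sepal.Width", "Species"),
  SSn    = c(1.412238, 72.752431),
  SSd    = c(28.003665, 28.003665)
)

# Hypothetical helper (not rstatix code): ges with optional observed effects.
# Assumption: effects are matched by exact name, which ignores interactions.
ges_sketch <- function(tab, observed = character(0)) {
  is_obs   <- tab$Effect %in% observed
  obs_ssn1 <- sum(tab$SSn[is_obs])        # summed SS of all observed effects
  obs_ssn2 <- ifelse(is_obs, tab$SSn, 0)  # the effect's own SS if it is observed
  tab$SSn / (tab$SSn + sum(unique(tab$SSd)) + obs_ssn1 - obs_ssn2)
}

pes          <- aov_table$SSn / (aov_table$SSn + aov_table$SSd)
ges_none     <- ges_sketch(aov_table)                        # nothing observed
ges_observed <- ges_sketch(aov_table, observed = "Species")  # Species observed

print(data.frame(Effect = aov_table$Effect, pes, ges_none, ges_observed))
```

Under these assumptions, ges_none equals pes for both effects, while marking Species as observed adds its SS to the denominator for Sepal.Width, which reproduces the 1.412238 / (1.412238 + 72.752431 + 28.003665) calculation.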

@c-hoffmann, but this does not apply when your independent variables are within-subjects. In this case, the ges is in fact smaller than the pes. I still don't understand what exactly goes in the denominator in this case. See Bakeman (2005), though: https://link.springer.com/article/10.3758/BF03192707