easystats / effectsize

:dragon: Compute and work with indices of effect size and standardized parameters
https://easystats.github.io/effectsize/

About `interpret_cramers_v()` and its degrees of freedom #647

Closed brianmsm closed 3 months ago

brianmsm commented 3 months ago

Describe the bug

Currently, the Cramer's V interpretation function is an alias of the correlation interpretation function (interpret_r()). This is correct as long as the degrees of freedom ($r-1$) equal 1, in which case V coincides with Cohen's w and the same interpretation applies.

Taking equation 7.2.6 (Cohen, 1988, p. 223) as a reference:

$$ V = \sqrt{\frac{\chi^2}{N(r - 1)}} = \frac{w}{\sqrt{r - 1}} $$

In a 2x2 table, where $r = 2$:

$$ V = \frac{w}{\sqrt{2 - 1}} = w $$

However, when Cramer's V comes from a table with more than 1 degree of freedom, the interpretation returned by interpret_cramers_v() will not be correct.

Note: The degrees of freedom ($r-1$) here refer to the number of categories of the variable with the fewest categories, minus 1. For example, in a 3x4 table, the degrees of freedom would be 3-1 = 2.
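To make the relationship between the two indices concrete, here is a minimal base-R sketch that computes w and V by hand for a 3x4 table and checks that $V = w / \sqrt{r - 1}$. The counts in `tab` are made up purely for illustration, and the variable names are not part of effectsize.

```r
# Hypothetical 3x4 contingency table; the counts are invented for illustration.
tab <- matrix(c(20, 15, 10,  5,
                10, 20, 15, 10,
                 5, 10, 20, 25),
              nrow = 3, byrow = TRUE)

n    <- sum(tab)
r    <- min(dim(tab))   # categories of the variable with the fewest categories (3)
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)

w <- sqrt(chi2 / n)              # Cohen's w
v <- sqrt(chi2 / (n * (r - 1)))  # Cramer's V, as in equation 7.2.6

# V equals w divided by sqrt(r - 1), up to floating-point tolerance
all.equal(v, w / sqrt(r - 1))
#> [1] TRUE
```

With $r - 1 = 2$, w is larger than V by a factor of $\sqrt{2}$, so the two values only coincide when the smaller table dimension is 2.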

Expected behavior

I thought of 3 ways this could be addressed:

  1. Keep the function as is and instruct the user in the documentation to convert Cramer's V to Cohen's w before calling interpret_cramers_v().

    library(effectsize)
    
    (w1 <- 0.3*sqrt(2-1))
    #> [1] 0.3
    (w2 <- v_to_w(0.3, nrow = 3, ncol = 4))
    #> [1] 0.4242641
    
    interpret_cramers_v(w1, 
                        rules = "cohen1988")
    #> [1] "moderate"
    #> (Rules: cohen1988)
    
    interpret_cramers_v(w2, 
                        rules = "cohen1988")
    #> [1] "moderate"
    #> (Rules: cohen1988)

    Created on 2024-06-19 with reprex v2.1.0

  2. Modify the function to include as an argument the specification of degrees of freedom:

    interpret_cramers_v(0.3, df = 2, rules = "cohen1988")

  3. Modify the function to include as an argument the row and column specification similar to v_to_w().

    interpret_cramers_v(0.3, nrow = 3, ncol = 4, rules = "cohen1988")

Personally, I like the second option better.


mattansb commented 3 months ago

Hi @brianmsm ,

I disagree with your assessment. As I see it, the issue lies not with V but with w. As we point out in our paper, Cohen's w is generally not analogous to a correlation coefficient, since it does not have an upper bound of 1.

In fact, for 2D contingency tables Cohen's w has an upper bound of $\sqrt{r-1}$, and we reflect this when constructing one-sided CIs for Cohen's w:

    data("Music_preferences", package = "effectsize")

    sqrt(min(dim(Music_preferences)) - 1)
    #> [1] 1.414214

    effectsize::cohens_w(Music_preferences)
    #> Cohen's w |       95% CI
    #> ------------------------
    #> 0.34      | [0.27, 1.41]
    #>
    #> - One-sided CIs: upper bound fixed at [1.41~].

In this sense, Cramer's V (and Tschuprow's T) are better analogs to a correlation coefficient, as they have an upper bound of 1. Together with פ (Fei), they are all normalized $\chi$ values (normalized to the maximal $\chi$ possible for the test/design).
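The two upper bounds can also be checked by hand. The following base-R sketch uses a made-up, perfectly associated 3x3 table (each row category occurs with exactly one column category) and computes both indices directly from the $\chi^2$ statistic. The table and variable names are illustrative only.

```r
# Hypothetical 3x3 table with perfect association (counts invented for illustration).
tab <- diag(c(30, 30, 30))

n    <- sum(tab)
r    <- min(dim(tab))
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)

w <- sqrt(chi2 / n)              # Cohen's w reaches its maximum, sqrt(r - 1)
v <- sqrt(chi2 / (n * (r - 1)))  # Cramer's V reaches its maximum, 1

w
#> [1] 1.414214
v
#> [1] 1
```

Under perfect association w lands on $\sqrt{r - 1}$, which depends on the table size, while V lands on 1 regardless of the table size.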

brianmsm commented 3 months ago

Hi @mattansb. Thank you very much for the reply and the illustration. I have carefully read the paper you mentioned and reviewed chapter 7 of Cohen (1988) again. Cohen's w indeed has an upper bound of $\sqrt{r - 1}$ that changes with the table dimensions, so the interpretation of its effect size changes as well. See the table on page 222: [image]

I had read this as saying that Cramer's V is the one that needs to be adjusted so that its effect size can be judged against the thresholds given for Cohen's w. It should be the opposite: Cohen's w should be rescaled to the Cramer's V scale if one wants to judge it with the effect size labels small, medium and large. Cohen's description seems to suggest otherwise, or at least I misunderstood it.

So, the current behavior of interpret_cramers_v() is correct.

mattansb commented 3 months ago

I've added a note in the docs:

image

brianmsm commented 3 months ago

Thank you @mattansb! Would it be possible to also state that Cramer's V needs no prior transformation and is interpreted in the same way regardless of the degrees of freedom it comes from? It may seem unnecessary, but other people may share the same misunderstanding about Cramer's V.

mattansb commented 3 months ago

I feel like that would raise more questions than be helpful.