Closed brianmsm closed 3 months ago
Hi @brianmsm ,
I disagree with your assessment. As I see it, the issue lies not with V but with w: as we point out in our paper, Cohen's w is generally not analogous to a correlation coefficient, since it does not have an upper bound of 1.
In fact, for 2D contingency tables Cohen's w has an upper bound of $\sqrt{r-1}$, and we reflect this when constructing one-sided CIs for Cohen's w:
data("Music_preferences", package = "effectsize")
sqrt(min(dim(Music_preferences)) - 1)
#> [1] 1.414214
effectsize::cohens_w(Music_preferences)
#> Cohen's w | 95% CI
#> ------------------------
#> 0.34 | [0.27, 1.41]
#>
#> - One-sided CIs: upper bound fixed at [1.41~].
In this sense, Cramer's V (and Tschuprow’s T) are better analogs to a correlation coefficient, as they have an upper bound of 1, and they (together with פ (Fei)) are all normalized $\chi$ values (normalized to the maximal $\chi$ possible for the test/design).
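To make the two bounds concrete, here is a minimal base R sketch. The 3x3 table is made up for illustration (it is not from the package or the paper) and has perfect association, so w should reach its maximum of $\sqrt{r-1}$ while V reaches 1.

```r
# Hypothetical 3x3 table with perfect association: all counts on the diagonal
tab <- diag(c(30, 30, 30))
n <- sum(tab)

# Pearson chi-square computed by hand from the expected counts
expected <- outer(rowSums(tab), colSums(tab)) / n
chisq <- sum((tab - expected)^2 / expected)

# Cohen's w is unbounded above by 1; Cramer's V rescales it by sqrt(r - 1)
w <- sqrt(chisq / n)
v <- w / sqrt(min(dim(tab)) - 1)

w  # equals sqrt(2), about 1.414, the upper bound for r = 3
v  # equals 1, the upper bound for Cramer's V
```

With any other 3x3 table, w would fall somewhere between 0 and $\sqrt{2}$ and V between 0 and 1.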
Hi @mattansb. Thank you very much for the reply and the illustration. I have carefully read the paper you indicated and reviewed chapter 7 of Cohen (1988) again. Cohen's w certainly has an upper bound of $\sqrt{r - 1}$ that changes with the table, so the interpretation of a given effect size changes with it. See the table on page 222. It would appear that it is Cramer's V that needs to be adjusted in order to judge its effect size against the benchmarks given for Cohen's w, when it should be the opposite: Cohen's w should be adjusted to the Cramer's V scale if one wants to describe that coefficient with the effect size adjectives small, medium, and large. Cohen's description seems to indicate the opposite, or at least I was misunderstanding it.
So, the current behavior of interpret_cramers_v() is correct.
I've added a note in the docs:
Thank you @mattansb! Is it possible to also add a note that Cramer's V needs no prior transformation and is interpreted in the same way regardless of the degrees of freedom it comes from? It may seem unnecessary, but other people might have the same misunderstanding about Cramer's V.
I feel like that would raise more questions than be helpful.
Describe the bug
Currently, the Cramer's V interpretation function is an alias of the correlation interpretation function (interpret_r()). This is correct as long as the degrees of freedom ($r-1$) equal 1, where the same interpretation as for Cohen's w applies. Taking equation 7.2.6 (Cohen, 1988, p. 223) as a reference:
$$ V = \sqrt{\frac{\chi^2}{N(r - 1)}} = \frac{w}{\sqrt{r - 1}} $$
In a 2x2 table, where $r = 2$:
$$ V = \frac{w}{\sqrt{2 - 1}} = w $$
However, when Cramer's V is estimated from a table with different degrees of freedom, what interpret_cramers_v() provides will not be correct.
Note: The degrees of freedom ($r-1$) here refer to the number of categories of the variable with the fewest categories, minus 1. For example, in a 3x4 table, the degrees of freedom would be 3-1 = 2.
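As a quick illustration of equation 7.2.6, the relation can be sketched in base R. The helper name `w_to_v_sketch` and its arguments are hypothetical, written only for this example; they are not functions from effectsize.

```r
# Hypothetical helper illustrating V = w / sqrt(r - 1),
# where r is the smaller of the two table dimensions
w_to_v_sketch <- function(w, n_rows, n_cols) {
  df <- min(n_rows, n_cols) - 1
  w / sqrt(df)
}

# In a 2x2 table (df = 1), V and w coincide
v_2x2 <- w_to_v_sketch(0.5, n_rows = 2, n_cols = 2)

# In a 3x4 table (df = 2), the same w maps to a smaller V
v_3x4 <- w_to_v_sketch(0.5, n_rows = 3, n_cols = 4)
```

Here `v_2x2` is 0.5 and `v_3x4` is 0.5 / sqrt(2), roughly 0.354.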
Expected behavior
I thought of 3 ways this could be addressed:
1. Keep the function as is and instruct the user in the documentation to transform the Cramer's V coefficient into Cohen's w in order to use the function interpret_cramers_v().
2. Modify the function to include the degrees of freedom as an argument.
3. Modify the function to include the row and column specification as arguments, similar to v_to_w().
Created on 2024-06-19 with reprex v2.1.0
Personally, I like the second option better.
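A rough sketch of what the second option might look like is below. The function name, its arguments, and the label set are assumptions made for illustration only; this is not the package's behavior. The cut-offs 0.1, 0.3, and 0.5 are Cohen's (1988) conventional benchmarks for w.

```r
# Hypothetical sketch of option 2: rescale V to w using the degrees of
# freedom, then label it with Cohen's (1988) benchmarks for w
interpret_v_by_df_sketch <- function(v, df = 1) {
  w <- v * sqrt(df)  # inverse of V = w / sqrt(df)
  labels <- cut(
    w,
    breaks = c(-Inf, 0.1, 0.3, 0.5, Inf),
    labels = c("very small", "small", "medium", "large"),
    right = FALSE  # intervals are [lower, upper)
  )
  as.character(labels)
}

# The same V receives a different label depending on df
label_df1 <- interpret_v_by_df_sketch(0.25, df = 1)  # w = 0.25
label_df2 <- interpret_v_by_df_sketch(0.25, df = 2)  # w is about 0.354
```

Under these assumptions `label_df1` is "small" and `label_df2` is "medium".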
Specifications (please complete the following information):
R Version 4.4.0
effectsize Version 0.8.1