adding `interpret_cramers_v()`?

easystats / effectsize

:dragon: Compute and work with indices of effect size and standardized parameters

https://easystats.github.io/effectsize/


adding `interpret_cramers_v()`? #347

IndrajeetPatil closed this issue 3 years ago

IndrajeetPatil commented 3 years ago

I would also like to work soon on adding support for chi-squared tests in the report package, and we will need interpretation guidelines for Cramer's V before I can do that.

I have seen guidelines online (like the one in the link below), but I can't seem to find any references for these. https://www.ibm.com/docs/en/cognos-analytics/11.1.0?topic=terms-cramrs-v

What do you think?

DominiqueMakowski commented 3 years ago

Here are some more:

mattansb commented 3 years ago

Since Phi and V are equal to Pearson's r for a 2*2 xtab, I would suspect that it should have the same interpretation for the *magnitude* of association.
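A minimal base-R sketch of this equality. The choice of `mtcars$am` and `mtcars$vs` as the two binary variables is an assumption for illustration, and the statistic is computed without the continuity correction. Under those assumptions, phi and Cramer's V derived from the chi-squared statistic should match the absolute Pearson correlation of the 0/1 codes:

```r
# Two binary variables from the built-in mtcars data (illustrative choice)
x <- mtcars$am
y <- mtcars$vs

tab <- table(x, y)
n <- sum(tab)

# Chi-squared statistic without Yates' continuity correction
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)

phi <- sqrt(chi2 / n)
cramers_v <- sqrt(chi2 / (n * min(dim(tab) - 1)))
pearson_r <- cor(x, y)

# For a 2*2 table all three agree in magnitude
all.equal(phi, abs(pearson_r))
all.equal(cramers_v, abs(pearson_r))
```

Phi and V carry no sign, so the comparison is against `abs(pearson_r)`.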

IndrajeetPatil commented 3 years ago

> Since Phi and V are equal to Pearson's r for a 2*2 xtab

But we should also plan ahead and cover one-way tests and non-2*2 tests, no?

library(effectsize)

effectsize(chisq.test(mtcars$cyl))
#> Cramer's V |       95% CI
#> -------------------------
#> 0.05       | [0.00, 0.00]

effectsize(chisq.test(mtcars$am, mtcars$cyl))
#> Warning in chisq.test(mtcars$am, mtcars$cyl): Chi-squared approximation may be
#> incorrect
#> Cramer's V |       95% CI
#> -------------------------
#> 0.52       | [0.11, 0.85]

Created on 2021-06-13 by the reprex package (v2.0.0)

mattansb commented 3 years ago

It's the same measure in all those cases; I'm just pointing out the equality in a specific case. So I think V can be interpreted similarly in all cases (like Pearson's r). (But not phi, which in cases other than 2*2 can be larger than 1.)
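The difference between phi and V outside the 2*2 case can be sketched in base R. The 3*3 table below is made up for illustration (a perfectly associated table), and phi is taken here as `sqrt(chi2 / n)`:

```r
# Made-up 3*3 table with perfect association (30 cases in each diagonal cell)
tab <- diag(30, 3)
n <- sum(tab)

chi2 <- unname(chisq.test(tab)$statistic)

phi <- sqrt(chi2 / n)                              # not bounded by 1 here
cramers_v <- sqrt(chi2 / (n * min(dim(tab) - 1)))  # rescaled to [0, 1]

phi        # sqrt(2), about 1.41
cramers_v  # 1
```

Dividing by `min(rows - 1, columns - 1)` is what keeps V on the same 0 to 1 scale regardless of table size.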

@bwiernik have any insight here?

bwiernik commented 3 years ago

Oh, I wrote a reply this morning that I must not have sent.

Yes, we can apply the same benchmarks for Cramer’s V and phi as for Pearson’s r. There is some nuance: the variables’ distributions affect the maximum observable r (Oscar Olveira’s work here is great), and this is most noticeable for discrete variables. But these are fairly small concerns, and as a first approximation, Pearson benchmarks are fine.
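To illustrate the nuance about distributions, here is a rough sketch for the simplest discrete case, two binary variables. The helper `max_r_binary()` is hypothetical (not part of any package); it assumes the largest possible joint probability of both variables being 1 is the smaller of the two marginal proportions:

```r
# Largest Pearson r two binary variables can reach, given their proportions
# of 1s (p_x, p_y). Hypothetical helper, not part of any package.
max_r_binary <- function(p_x, p_y) {
  # Largest possible joint probability P(x = 1, y = 1)
  p_both <- min(p_x, p_y)
  cov_max <- p_both - p_x * p_y
  cov_max / sqrt(p_x * (1 - p_x) * p_y * (1 - p_y))
}

max_r_binary(0.5, 0.5)  # 1: equal marginals allow a perfect correlation
max_r_binary(0.1, 0.5)  # about 0.33: unequal marginals cap r well below 1
```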

mattansb commented 3 years ago

👆

Do we even need a separate function? Will an alias do here?

DominiqueMakowski commented 3 years ago

I guess an alias is fine, until some "specific" rules of thumb are published (if ever).
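What an alias could look like, as a rough sketch only. The function names and the Cohen (1988) style cut-offs (0.1, 0.3, 0.5) below are assumptions for illustration, not the package's actual implementation:

```r
# Hypothetical stand-in for a Pearson r interpreter, using the conventional
# Cohen (1988) cut-offs of 0.1, 0.3, and 0.5 (an assumption for illustration)
interpret_r_sketch <- function(r) {
  labels <- c("very small", "small", "moderate", "large")
  # findInterval() returns 0..3 for the bin of |r|; shift to index the labels
  labels[findInterval(abs(r), c(0.1, 0.3, 0.5)) + 1]
}

# The alias: Cramer's V reuses the same benchmarks under a different name
interpret_cramers_v_sketch <- interpret_r_sketch

interpret_cramers_v_sketch(c(0.05, 0.52))
#> [1] "very small" "large"
```

Because the alias is just a second name bound to the same function, any later change to the r benchmarks would carry over to V automatically, which may or may not be desirable if V-specific rules of thumb are published.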