kassambara / rstatix

Pipe-friendly Framework for Basic Statistical Tests in R
https://rpkgs.datanovia.com/rstatix/

Difference between Kruskal_test and kruskal.test #182

Open jtbelcik opened 1 year ago

jtbelcik commented 1 year ago

I have a rather large dataset of characteristics that I want to test for statistical differences among five groups. The characters are not continuous; they are meristic characters of fish (e.g., scale counts, fin ray counts, etc.). Many of them are non-normal, so I'm comparing them using the non-parametric Kruskal-Wallis test. Past researchers have used this test or ANOVA, so I believe it is an appropriate way to analyze the differences between species.

However, I don't know which version of the Kruskal-Wallis test to use. When using the kruskal_test function from the rstatix package I get one answer, and when using the kruskal.test function from the stats package I get a different answer. Likewise, I get different answers when using the dunn_test function from the rstatix package and the dunn.test function from the dunn.test package. The comparisons that are statistically different in one test are usually statistically different in the other, but not always. Everything else in the two sets of results is the same (degrees of freedom, chi-squared value, n); only the p-value differs. I'm wondering what difference between the functions/packages is driving this. Maybe I'm just leaving something out of the code block, but I'm not sure.

For reference here is an example of the code that I'm using for each test:

rstatix::kruskal_test(Variable~Species, data=df)

stats::kruskal.test(Variable~Species, data=df)

GegznaV commented 1 year ago

@jtbelcik, what do you mean when you say the p-values are different? Within the margin of numerical tolerance, I found no difference using the iris example.

all.equal(
    rstatix::kruskal_test(Petal.Length ~ Species, data = iris)$p,
    stats::kruskal.test(  Petal.Length ~ Species, data = iris)$p.value
)
#> [1] TRUE

Created on 2023-05-15 with reprex v2.0.2

Please clarify and provide a reprex.
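
As a possible starting point for such a reprex, here is a minimal sketch that uses only base R and the built-in iris data (an assumption, since the original dataset was not shared). It recomputes the Kruskal-Wallis p-value from the reported statistic and degrees of freedom. If the statistic and degrees of freedom really match between two functions, the unadjusted p-value should match as well, so a remaining difference would more likely come from something else. For the Dunn comparisons, for instance, it may be worth checking whether both calls apply the same p-value adjustment, since defaults can differ between packages.

```r
# Sketch only: iris stands in for the unshared dataset (assumption).
kw <- stats::kruskal.test(Petal.Length ~ Species, data = iris)

# The Kruskal-Wallis p-value is the upper tail of a chi-squared
# distribution evaluated at the test statistic, with the test's
# degrees of freedom.
p_manual <- stats::pchisq(
  unname(kw$statistic),
  df = unname(kw$parameter),
  lower.tail = FALSE
)

# Compare the hand-computed p-value with the one reported by the test.
same_p <- isTRUE(all.equal(p_manual, kw$p.value))
print(same_p)
```

Running the same comparison on the real data, with the statistic and degrees of freedom taken from each package's output, should show whether the two p-values are computed from the same inputs.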