kassambara / rstatix

Pipe-friendly Framework for Basic Statistical Tests in R
https://rpkgs.datanovia.com/rstatix/

double-checking p-values from Dunn test #50

Closed IndrajeetPatil closed 3 years ago

IndrajeetPatil commented 4 years ago

The p-values from dunn.test and rstatix don't match, and I am not sure why. I also ran the same comparison in popular GUI software such as jamovi, and its p-values match the ones output by dunn.test. So I thought I would raise this issue.

library(dunn.test)

invisible(capture.output(df <-
                 as.data.frame(dunn.test(
                   x = mtcars$wt,
                   g = as.factor(mtcars$cyl),
                   table = FALSE,
                   kw = FALSE,
                   label = FALSE,
                   alpha = 0.05,
                   method = "none"
                 )), 
                 file = NULL))

tibble::as_tibble(df)
#> # A tibble: 3 x 5
#>    chi2     Z           P  P.adjusted comparisons
#>   <dbl> <dbl>       <dbl>       <dbl> <fct>      
#> 1  22.8 -1.84 0.0332      0.0332      4 - 6      
#> 2  22.8 -4.76 0.000000988 0.000000988 4 - 8      
#> 3  22.8 -2.22 0.0132      0.0132      6 - 8

library(rstatix)
#> 
#> Attaching package: 'rstatix'
#> The following object is masked from 'package:stats':
#> 
#>     filter

dunn_test(mtcars, wt ~ cyl, p.adjust.method = "none")
#> # A tibble: 3 x 9
#>   .y.   group1 group2    n1    n2 statistic          p      p.adj p.adj.signif
#> * <chr> <chr>  <chr>  <int> <int>     <dbl>      <dbl>      <dbl> <chr>       
#> 1 wt    4      6         11     7      1.84 0.0663     0.0663     ns          
#> 2 wt    4      8         11    14      4.76 0.00000198 0.00000198 ****        
#> 3 wt    6      8          7    14      2.22 0.0263     0.0263     *

Created on 2020-05-28 by the reprex package (v0.3.0.9001)

Session info

``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R Under development (unstable) (2020-02-28 r77874)
#>  os       Windows 10 x64
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.1252
#>  ctype    English_United States.1252
#>  tz       Europe/Berlin
#>  date     2020-05-28
#>
#> - Packages -------------------------------------------------------------------
#>  package     * version     date       lib source
#>  abind         1.4-5       2016-07-21 [1] CRAN (R 4.0.0)
#>  assertthat    0.2.1       2019-03-21 [1] CRAN (R 4.0.0)
#>  backports     1.1.7       2020-05-13 [1] CRAN (R 4.0.0)
#>  broom         0.7.0.9000  2020-05-26 [1] Github (tidymodels/broom@a56ab06)
#>  car           3.0-8       2020-05-21 [1] CRAN (R 4.0.0)
#>  carData       3.0-4       2020-05-22 [1] CRAN (R 4.0.0)
#>  cellranger    1.1.0       2016-07-27 [1] CRAN (R 4.0.0)
#>  cli           2.0.2       2020-02-28 [1] CRAN (R 4.0.0)
#>  crayon        1.3.4       2017-09-16 [1] CRAN (R 4.0.0)
#>  curl          4.3         2019-12-02 [1] CRAN (R 4.0.0)
#>  data.table    1.12.8      2019-12-09 [1] CRAN (R 4.0.0)
#>  digest        0.6.25      2020-02-23 [1] CRAN (R 4.0.0)
#>  dplyr         0.8.99.9003 2020-05-25 [1] Github (tidyverse/dplyr@735e6a2)
#>  dunn.test   * 1.3.5       2017-10-27 [1] CRAN (R 4.0.0)
#>  ellipsis      0.3.1       2020-05-15 [1] CRAN (R 4.0.0)
#>  evaluate      0.14        2019-05-28 [1] CRAN (R 4.0.0)
#>  fansi         0.4.1       2020-01-08 [1] CRAN (R 4.0.0)
#>  forcats       0.5.0       2020-03-01 [1] CRAN (R 4.0.0)
#>  foreign       0.8-75      2020-01-20 [2] CRAN (R 4.0.0)
#>  fs            1.4.1       2020-04-04 [1] CRAN (R 4.0.0)
#>  generics      0.0.2       2018-11-29 [1] CRAN (R 4.0.0)
#>  glue          1.4.1       2020-05-13 [1] CRAN (R 4.0.0)
#>  haven         2.3.0       2020-05-24 [1] CRAN (R 4.0.0)
#>  highr         0.8         2019-03-20 [1] CRAN (R 4.0.0)
#>  hms           0.5.3       2020-01-08 [1] CRAN (R 4.0.0)
#>  htmltools     0.4.0       2019-10-04 [1] CRAN (R 4.0.0)
#>  knitr         1.28        2020-02-06 [1] CRAN (R 4.0.0)
#>  lifecycle     0.2.0.9000  2020-03-16 [1] Github (r-lib/lifecycle@355dcba)
#>  magrittr      1.5         2014-11-22 [1] CRAN (R 4.0.0)
#>  openxlsx      4.1.5       2020-05-06 [1] CRAN (R 4.0.0)
#>  pillar        1.4.4       2020-05-05 [1] CRAN (R 4.0.0)
#>  pkgconfig     2.0.3       2019-09-22 [1] CRAN (R 4.0.0)
#>  purrr         0.3.4       2020-04-17 [1] CRAN (R 4.0.0)
#>  R6            2.4.1       2019-11-12 [1] CRAN (R 4.0.0)
#>  Rcpp          1.0.4.6     2020-04-09 [1] CRAN (R 4.0.0)
#>  readxl        1.3.1       2019-03-13 [1] CRAN (R 4.0.0)
#>  reprex        0.3.0.9001  2020-03-25 [1] Github (tidyverse/reprex@a019cc4)
#>  rio           0.5.16      2018-11-26 [1] CRAN (R 4.0.0)
#>  rlang         0.4.6       2020-05-02 [1] CRAN (R 4.0.0)
#>  rmarkdown     2.1         2020-01-20 [1] CRAN (R 4.0.0)
#>  rstatix     * 0.5.0       2020-04-28 [1] CRAN (R 4.0.0)
#>  rstudioapi    0.11        2020-02-07 [1] CRAN (R 4.0.0)
#>  sessioninfo   1.1.1       2018-11-05 [1] CRAN (R 4.0.0)
#>  stringi       1.4.6       2020-02-17 [1] CRAN (R 4.0.0)
#>  stringr       1.4.0       2019-02-10 [1] CRAN (R 4.0.0)
#>  styler        1.3.2.9000  2020-05-17 [1] Github (r-lib/styler@8dad103)
#>  tibble        3.0.1       2020-04-20 [1] CRAN (R 4.0.0)
#>  tidyr         1.1.0       2020-05-20 [1] CRAN (R 4.0.0)
#>  tidyselect    1.1.0       2020-05-11 [1] CRAN (R 4.0.0)
#>  utf8          1.1.4       2018-05-24 [1] CRAN (R 4.0.0)
#>  vctrs         0.3.0       2020-05-09 [1] Github (r-lib/vctrs@5b71d88)
#>  withr         2.2.0       2020-04-20 [1] CRAN (R 4.0.0)
#>  xfun          0.14        2020-05-20 [1] CRAN (R 4.0.0)
#>  yaml          2.2.1       2020-02-01 [1] CRAN (R 4.0.0)
#>  zip           2.0.4       2019-09-01 [1] CRAN (R 4.0.0)
#>
#> [1] C:/Users/inp099/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-devel/library
```

kassambara commented 4 years ago

Isn't this a duplicate of #20?

IndrajeetPatil commented 4 years ago

Yikes, sorry about that! Should have commented there.

But I think the emphasis here is a bit different. I am no longer concerned about why the p-values differ when the z-values are identical across software, but rather about what the default p-value output from rstatix should be. Compared to most other packages and GUI software, the rstatix p-values stand out (they are twice the other outputs!), so I thought you might want to reconsider them.
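
As a rough check of the "twice" observation, the p-values printed in the two outputs above can be divided directly. This is only a sketch: the vectors are copied by hand from the reprex, the names `p_rstatix` and `p_dunn` are made up for illustration, and because the printed values are rounded the ratios come out close to 2 rather than exactly 2.

```r
# p-values copied from the printed reprex output above (rounded as displayed)
p_rstatix <- c(`4-6` = 0.0663, `4-8` = 0.00000198, `6-8` = 0.0263)
p_dunn    <- c(`4-6` = 0.0332, `4-8` = 0.000000988, `6-8` = 0.0132)

# Ratio of rstatix to dunn.test p-values for each comparison
ratio <- p_rstatix / p_dunn
print(round(ratio, 2))
```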

At any rate, I will close the issue and will leave it to your best judgment! Thanks for considering.

kassambara commented 4 years ago

The default in the dunn.test package and jamovi is to perform a one-sided test, which is not the default in well-known commercial software such as SPSS and GraphPad (see discussion here).
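
A minimal sketch of the one-sided versus two-sided difference, assuming the p-values are taken from a standard normal reference distribution for the z statistic. It uses only base R's `pnorm()` and the rounded z statistics printed in the reprex above; the variable names are illustrative, and since the z values are rounded the results only approximately match the reported p-values.

```r
# z statistics as printed in the reprex output (rounded to two decimals)
z <- c(`4-6` = 1.84, `4-8` = 4.76, `6-8` = 2.22)

# One-sided p-value: upper-tail probability of |z|
p_one_sided <- pnorm(abs(z), lower.tail = FALSE)

# Two-sided p-value: both tails, i.e. twice the one-sided value
p_two_sided <- 2 * pnorm(abs(z), lower.tail = FALSE)

print(signif(p_one_sided, 3))
print(signif(p_two_sided, 3))
```

Under this assumption the one-sided values land near the dunn.test column `P` and the two-sided values near the rstatix column `p`, which is consistent with the factor of two noted earlier in the thread.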

I think we should keep performing a two-sided Dunn test by default in rstatix, like SPSS and GraphPad.

But obviously I should update the description section of rstatix::dunn_test() to mention this discrepancy with the dunn.test package. So let's keep this issue open until the doc is updated.

kassambara commented 3 years ago

The doc is updated now, thanks.