kassambara / ggpubr

'ggplot2' Based Publication Ready Plots
https://rpkgs.datanovia.com/ggpubr/

p-values different in t.test through stat_compare_means and t.test in R #44

Closed cchien1 closed 6 years ago

cchien1 commented 7 years ago

Hi, I am not sure what kind of t-test ggpubr performs. It is not a big deal, but I need the t-statistic for these tests on multiple plots. stat_compare_means does not report it, so I ran the t.test function myself (Welch Two Sample t-test), which gave me different p-values from the ones printed by ggpubr.

  1. Is there a way to print the t-statistic through ggpubr?

  2. Is there a way to use the Welch Two Sample t-test in ggpubr?

Thanks so much! Looking forward to hearing from you.

Claudia

kassambara commented 7 years ago

Hi,

By default, ggpubr performs a Wilcoxon test.

For a t.test, specify the argument method = "t.test":

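# Load ggpubr and the demo data
library(ggpubr)
data("ToothGrowth")
head(ToothGrowth)
# Box plot of tooth length by supplement type
p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
               color = "supp", palette = "npg")
# Add the t-test p-value to the plot
p + stat_compare_means(method = "t.test")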


As you can see below, the output of ggpubr is the Welch t-test:

t.test(len ~ supp, data = ToothGrowth)

    Welch Two Sample t-test

data:  len by supp
t = 1.9153, df = 55.309, p-value = 0.06063
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1710156  7.5710156
sample estimates:
mean in group OJ mean in group VC 
        20.66333         16.96333 

Please make sure that you have the latest development version of ggpubr.
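On the first question (getting the t statistic), a minimal base R sketch may help; it does not use ggpubr at all. The object returned by t.test is a list of class "htest", so the statistic, degrees of freedom, and p-value can be read from it directly. The variable names welch and student are only illustrative. The second call is included because a mismatch in p-values can come from comparing Welch's test with the equal-variance Student's test, although that is only one possible cause.

```r
# Welch's t-test: the base R default (var.equal = FALSE)
welch <- t.test(len ~ supp, data = ToothGrowth)

# Student's t-test: assumes equal variances in both groups
student <- t.test(len ~ supp, data = ToothGrowth, var.equal = TRUE)

# Components of the "htest" result
welch$statistic   # t statistic
welch$parameter   # degrees of freedom (fractional for Welch)
welch$p.value     # p-value, as printed by t.test above

# The equal-variance test uses n1 + n2 - 2 degrees of freedom,
# so its p-value can differ from the Welch one
student$parameter
student$p.value
```

If the p-value on the plot matches welch$p.value, the annotation and the manual call are running the same test on the same data.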

cchien1 commented 7 years ago

Thanks so much for your help. I don't know why I still don't have this; I was selecting "t.test". Does it change if you use facet_wrap with a different variable?

m_dftest <- DF %>%
  gather("Method", "Value", c("M", "C", "T")) %>%
  mutate(Method = factor(Method, levels = c("M", "C", "T")))

m_group <- ggplot(data = m_dftest, aes(x = Group, y = Value, alpha = 0.3)) +
  geom_boxplot() +
  labs(x = "", y = "") +
  geom_quasirandom(dodge.width = .9) +
  facet_wrap(~ Method, scales = "free", ncol = 3)
m_group + stat_compare_means(method = "t.test", label = "p.format", label.x = 1.25)


t.test(subset(DF, Group=="1")$M, subset(DF, Group=="2")$M)

    Welch Two Sample t-test

data:  subset(DF, Group == "1")$M and subset(DF, Group == "2")$M
t = 3.4153, df = 44.907, p-value = 0.001362
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  2.599328 10.072851
sample estimates:
mean of x mean of y 
 76.32192  69.98583 

kassambara commented 7 years ago

Your script is not reproducible because it doesn't include a demo data set that I could use to check the issue.

Please specify the argument facet.by in the function ggboxplot().

library(ggpubr)
p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
               color = "supp", palette = "jco",
               add = "jitter",
               facet.by = "dose", short.panel.labs = FALSE)

p + stat_compare_means(method = "t.test", label = "p.format")


subset(ToothGrowth, dose == "0.5") %>%
  t.test(len ~ supp, data = .)

    Welch Two Sample t-test

data:  len by supp
t = 3.1697, df = 14.969, p-value = 0.006359
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.719057 8.780943
sample estimates:
mean in group OJ mean in group VC 
           13.23             7.98 

compare_means(len ~ supp, data = ToothGrowth,
              group.by = "dose",
              method = "t.test")
# A tibble: 3 x 9
   dose   .y. group1 group2           p       p.adj p.format p.signif method
  <dbl> <chr>  <chr>  <chr>       <dbl>       <dbl>    <chr>    <chr>  <chr>
1   0.5   len     OJ     VC 0.006358607 0.012717214   0.0064       ** T-test
2   1.0   len     OJ     VC 0.001038376 0.003115128   0.0010       ** T-test
3   2.0   len     OJ     VC 0.963851589 0.963851589   0.9639       ns T-test
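The table above has no column for the test statistic, which was the first question in this thread. One possible workaround, sketched here with base R only, is to split the data by the faceting variable and run t.test within each subset. The names by_dose, rows, and t_by_dose are illustrative and are not part of ggpubr.

```r
# One data frame per dose level (the faceting variable)
by_dose <- split(ToothGrowth, ToothGrowth$dose)

# Run Welch's t-test within each dose level and keep the key numbers
rows <- lapply(names(by_dose), function(d) {
  tt <- t.test(len ~ supp, data = by_dose[[d]])
  data.frame(dose = d,
             t    = unname(tt$statistic),  # t statistic
             df   = unname(tt$parameter),  # Welch degrees of freedom
             p    = tt$p.value)            # unadjusted p-value
})

# Combine the per-dose results into one table
t_by_dose <- do.call(rbind, rows)
print(t_by_dose)
```

The p column should agree with the unadjusted p column of compare_means, so the same table can be used to cross-check the values printed in each panel.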