kassambara / rstatix

Pipe-friendly Framework for Basic Statistical Tests in R
https://rpkgs.datanovia.com/rstatix/

Group add_xy_position for facetted plots #56

Closed phrenicooesophageale closed 4 years ago

phrenicooesophageale commented 4 years ago

library(dplyr)
library(tidyr)
library(ggpubr)
library(rstatix)

df <- ToothGrowth
df$dose <- as.factor(df$dose)
df$group <- factor(rep(c("grp1", "grp2"), 30))
head(df, 3)
df[1, 1] = 500
df[3, 1] = 495
df[5, 1] = 505

stat <- df %>%
  group_by(supp, group) %>%
  tukey_hsd(len ~ dose) %>%
  add_xy_position()

p <- ggbarplot(df, x = "dose", y = "len", add = c("mean_se"), facet.by = c("supp", "group"), scales = "free") +
  stat_pvalue_manual(stat, label = "p.adj.signif", tip.length = 0.01, hide.ns = TRUE) +
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.1)))
p

[image]

If one value in your data is much larger than the rest, the y-position of the p-value brackets is adjusted in all facets, even though scales = "free". At least in the OJ facets, the p-value brackets should appear just above the bars. Is there any way to group add_xy_position() to avoid this?

kassambara commented 4 years ago

The default of the function add_xy_position() is to automatically compute a global step increase value between brackets. This calculation assumes that the y scales of plot panels are fixed.
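To see why a single global value misbehaves with free scales, here is a minimal base-R sketch. It is not the internal computation of add_xy_position(); it only compares the tallest bar in each panel with the tallest bar overall, using the data from the question. The names `mean_se_top`, `bars`, `panel_top` and `global_top` are made up for this illustration.

```r
# Rebuild the data from the question (ToothGrowth ships with base R).
df <- ToothGrowth
df$dose <- as.factor(df$dose)
df$group <- factor(rep(c("grp1", "grp2"), 30))
df[c(1, 3, 5), 1] <- c(500, 495, 505)

# Top of a "mean_se" bar: mean plus one standard error.
# (Hypothetical helper, written only for this sketch.)
mean_se_top <- function(x) mean(x) + sd(x) / sqrt(length(x))

# One value per bar (dose within each supp x group panel).
bars <- aggregate(len ~ dose + supp + group, data = df, FUN = mean_se_top)

# Tallest bar within each facet panel versus the tallest bar overall.
panel_top <- aggregate(len ~ supp + group, data = bars, FUN = max)
global_top <- max(bars$len)

panel_top$global_top <- global_top
panel_top
```

Only one panel (VC, grp1) comes close to `global_top`. Any bracket spacing derived from the global range is therefore far too large for the other three panels once each panel gets its own y scale.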

When you want free scales, you can:

  1. Set the option step.increase to 0 when calling the function add_xy_position().
  2. Specify only the option step.increase in the function stat_pvalue_manual(). In this case, the step.increase will be adapted to each plot panel.

Please install the latest GitHub development versions of rstatix and ggpubr:

devtools::install_github("kassambara/rstatix")
devtools::install_github("kassambara/ggpubr")

Then try the following R code:

suppressPackageStartupMessages(library(ggpubr))
suppressPackageStartupMessages(library(rstatix))

# Data
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Transform `dose` into factor variable
df <- ToothGrowth
df$dose <- as.factor(df$dose)
# Add a random grouping variable
df$group <- factor(rep(c("grp1", "grp2"), 30))
# Add some extremely high values in column 1 at rows c(1, 3, 5).
df[c(1, 3, 5),  1] <- c(500, 495, 505)

# Statistical test
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
stat.test <- df %>%
  group_by(group, supp) %>%
  tukey_hsd(len ~ dose) 
stat.test 
#> # A tibble: 12 x 11
#>    supp  group term  group1 group2 null.value estimate conf.low conf.high
#>  * <fct> <fct> <chr> <chr>  <chr>       <dbl>    <dbl>    <dbl>     <dbl>
#>  1 OJ    grp1  dose  0.5    1               0     6.32    0.275     12.4 
#>  2 OJ    grp1  dose  0.5    2               0    11.3     5.26      17.3 
#>  3 OJ    grp1  dose  1      2               0     4.98   -1.06      11.0 
#>  4 VC    grp1  dose  0.5    1               0  -286.   -548.       -23.5 
#>  5 VC    grp1  dose  0.5    2               0  -276.   -539.       -14.0 
#>  6 VC    grp1  dose  1      2               0     9.46 -253.       272.  
#>  7 OJ    grp2  dose  0.5    1               0    12.6     6.23      19.0 
#>  8 OJ    grp2  dose  0.5    2               0    14.4     7.97      20.7 
#>  9 OJ    grp2  dose  1      2               0     1.74   -4.65       8.13
#> 10 VC    grp2  dose  0.5    1               0     7.12    0.917     13.3 
#> 11 VC    grp2  dose  0.5    2               0    16.4    10.2       22.6 
#> 12 VC    grp2  dose  1      2               0     9.28    3.08      15.5 
#> # … with 2 more variables: p.adj <dbl>, p.adj.signif <chr>

# Plot
#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# Note: you need to specify "mean_se" in both add_xy_position() and ggbarplot()
stat.test <- stat.test %>% 
  add_xy_position(x = "dose", fun = "mean_se", step.increase = 0)

bp <- ggbarplot(
  df, x = "dose", y = "len", fill = "#00AFBB", add = "mean_se",
  facet.by = c("supp", "group"), scales = "free"
) 
bp +
  stat_pvalue_manual(stat.test, hide.ns = TRUE, tip.length = 0, step.increase = 0.2) +
  scale_y_continuous(expand = expansion(mult = c(0.05, 0.15)))

Created on 2020-06-28 by the reprex package (v0.3.0.9001)

Read more at: Add P-values to GGPLOT Facets with Different Scales.

phrenicooesophageale commented 4 years ago

Amazing support, thank you kassambara!

clmr413 commented 3 months ago

Hi,

While working on the same problem as described in this post, I noticed that in some cases the y.position of the p-value appears to be calculated incorrectly. It looks exactly like the fourth plot (grp2, vc), where the boxplots themselves are scrunched together and the p-values float somewhere above them. Is there a way to avoid this?

Thank you very much in advance!