csdaw / ggprism

ggplot2 extension inspired by GraphPad Prism
https://csdaw.github.io/ggprism/

step.increase in add_y_position() is bad #12

Open n-osennij opened 2 years ago

n-osennij commented 2 years ago

I generate many graphs of the same type, each with a different y-axis scale. The default step.increase value does not suit them: the p-value brackets on the graph end up stuck together. If I change step.increase from its default, the brackets instead spread too far apart.

stat.test <- total %>%
  rstatix::wilcox_test(as.formula(paste(col_names[col_i], "group", sep = "~"))) %>%
  rstatix::add_y_position()
stat.test$y.position[3] <- stat.test$y.position[1]

[Attached plots: 5 BDNF_99__27721798, 6 BDNF_92__27721791, 7 HTR2A_06__47471705, 12 FKBP5_38__35558438, 13 FKBP5_88__35558488, 16 TRKB__87283470]

csdaw commented 2 years ago

Without a reproducible example it's a little hard to know the exact problem. Ultimately I think this is an issue with the rstatix::add_y_position() function, so you might like to raise an issue in the rstatix repo instead.

However, I propose a solution below. Hopefully this helps.

library(ggplot2)

# PROBLEM
# make fake grouped data with similar means
set.seed(2022)
df1 <- data.frame(
  x = rep(paste0("group", 1:3), each = 100),
  y = rnorm(300, mean = 0, sd = 5)
)

# perform the stat test and move right bracket down
stat.test1 <- rstatix::wilcox_test(df1, y ~ x)
stat.test1 <- rstatix::add_y_position(stat.test1)
stat.test1$y.position[3] <- stat.test1$y.position[1]

# plot (brackets are a bit low but spacing is ok)
ggplot(df1, aes(x = x, y = y)) + 
  geom_boxplot(fill = "grey80") + 
  geom_jitter() + 
  theme_bw() + 
  ggprism::add_pvalue(stat.test1)

# make fake grouped data with different means
df2 <- data.frame(
  x = rep(paste0("group", 1:3), each = 100),
  y = c(rnorm(100, mean = 0, sd = 5), rnorm(200, mean = 30, sd = 5))
)

# perform the stat test and move right bracket down
stat.test2 <- rstatix::wilcox_test(df2, y ~ x)
stat.test2 <- rstatix::add_y_position(stat.test2)
stat.test2$y.position[3] <- stat.test2$y.position[1]

# plot (brackets are high enough but spacing is too far)
ggplot(df2, aes(x = x, y = y)) + 
  geom_boxplot(fill = "grey80") + 
  geom_jitter() + 
  theme_bw() + 
  ggprism::add_pvalue(stat.test2)


# SOLUTION:
# write a function to produce more reasonable y positions
# (assumes a numeric column named y and exactly three pairwise comparisons)
make_ypos <- function(df) {
  max_y <- max(df$y) # get highest point
  min_y <- min(df$y) # get lowest point
  y_step <- (max_y - min_y) / 10 # 10% of the range of the data

  # output the bracket y positions
  # (bottom left, top, bottom right)
  c(max_y + y_step, max_y + y_step * 1.5, max_y + y_step)
}

# override y.position with a vector of your own calculated values
# works well in both example situations
ggplot(df1, aes(x = x, y = y)) + 
  geom_boxplot(fill = "grey80") + 
  geom_jitter() + 
  theme_bw() + 
  ggprism::add_pvalue(stat.test1, y.position = make_ypos(df1))

ggplot(df2, aes(x = x, y = y)) + 
  geom_boxplot(fill = "grey80") + 
  geom_jitter() + 
  theme_bw() + 
  ggprism::add_pvalue(stat.test2, y.position = make_ypos(df2))

Created on 2022-01-17 by the reprex package (v2.0.1)
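
The make_ypos() helper above is hard-coded for three brackets. For anyone producing many plots with a different number of comparisons, the same idea can be sketched more generally. This is only a sketch: make_ypos_tiers() and its tier, offset and step arguments are hypothetical names invented for this example and are not part of ggprism or rstatix. It uses base R only and takes the plotted values as a plain numeric vector.

```r
# Hypothetical generalisation of make_ypos(); not a ggprism or rstatix function.
# y:      numeric vector of the plotted values
# tier:   integer tier for each bracket (1 = lowest); brackets that share a
#         tier get the same height, so they must not overlap horizontally
# offset: gap between the highest point and tier 1, as a fraction of the y range
# step:   gap between successive tiers, as a fraction of the y range
make_ypos_tiers <- function(y, tier, offset = 0.10, step = 0.05) {
  stopifnot(is.numeric(y), length(tier) >= 1, all(tier >= 1))
  y_max <- max(y, na.rm = TRUE)
  y_range <- diff(range(y, na.rm = TRUE))
  # every height is expressed relative to the data range, so the spacing
  # looks the same whatever the scale of the y axis
  y_max + y_range * (offset + step * (tier - 1))
}

# three brackets in the (bottom left, top, bottom right) order used above
y <- c(0, 2, 4, 6, 8, 10)
ypos <- make_ypos_tiers(y, tier = c(1, 2, 1))
```

With the default offset and step, tier = c(1, 2, 1) gives the same heights as make_ypos(), so the result could be passed in the same way, for example ggprism::add_pvalue(stat.test1, y.position = make_ypos_tiers(df1$y, tier = c(1, 2, 1))). Which tier each comparison belongs to still has to be chosen by hand to match the order of the rows in the test results.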