const-ae / ggsignif

Easily add significance brackets to your ggplots
https://const-ae.github.io/ggsignif/
GNU General Public License v3.0

geom_signif - all comparisons disappear when one comparison has missing values #126

Open MPietzke opened 1 year ago

MPietzke commented 1 year ago

I initially posted this at ggpubr (https://github.com/kassambara/ggpubr/issues/503); however, that is just a shameless wrapper for geom_signif(), so maybe it's better suited here!?

Using geom_signif() for multiple comparisons works fine until one of the comparisons cannot be performed (e.g. because too many values are missing). In that case, all the comparisons that could still be performed disappear as well! Please see this example:

# Packages used below
library(tibble)
library(dplyr)
library(ggplot2)
library(ggsignif)

# A dataset with some NAs
dataset = tibble(
  "Sample" = rep(c("Sample1", "Sample2"), each = 15),
  "Cond"   = rep(c("A", "B", "C",
                   "A", "B", "C"), each = 5),
  "Rep"    = rep(1:5, 6),
  "Value"  = c(runif(5, 10, 12),  #A1
               runif(5, 11, 14),  #B1
               runif(5, 10, 13),  #C1
               runif(5, 10, 12),  #A2
               runif(5, 11, 14),  #B2
               c(runif(2, 10, 13), NA, NA, NA) #C2
  ))

# With at least 2 data points per group, all the comparisons we want are drawn
ggplot(dataset, 
       aes(x = Cond, y = Value, 
           colour = as.factor(Cond),
           fill = as.factor(Cond) )) + 
  geom_jitter(size = 5, width = 0.2, alpha = 0.3, stroke = 1.5,
              shape = 21) + 
  stat_summary(fun.min = mean, fun.max = mean, size = 1.5,                
               geom='errorbar') + 
  facet_wrap( ~ Sample) +
  theme_bw()  + scale_y_continuous(limits = c(0, 16)) +
  geom_signif(comparisons = list(c("A", "B"),
                                 c("B", "C")),
              step_increase = 0.2,
              colour = "black") + 
  theme(legend.position = "none")

image

With only NAs in one of the conditions (C), the other comparison (A-B) disappears as well!

ggplot(data = filter(dataset, Rep >= 3), 
       aes(x = Cond, y = Value, 
           colour = as.factor(Cond),
           fill = as.factor(Cond) )) + 
  geom_jitter(size = 5, width = 0.2, alpha = 0.3, stroke = 1.5,
              shape = 21) + 
  stat_summary(fun.min = mean, fun.max = mean, size = 1,                
               geom='errorbar') + 
  facet_wrap( ~ Sample) +
  theme_bw()  + scale_y_continuous(limits = c(0, 16)) +
  geom_signif(comparisons = list(c("A", "B"),
                                 c("B", "C")),
              step_increase = 0.2,
              colour = "black") + 
  theme(legend.position = "none")

image

Here the comparison A-B is lost as well, even though it can still be calculated. One could adapt the list of comparisons (after seeing that it does not work in one of the cases), but in general I want a consistent picture across multiple (usually more than just 2) samples.

It throws warnings, so the function at least already knows that something fails:

1: Removed 3 rows containing non-finite values (stat_summary).
2: Removed 3 rows containing non-finite values (stat_signif).
3: Computation failed in stat_signif(): not enough 'y' observations.
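The message in the third warning looks like it comes from the test function rather than from ggsignif itself: wilcox.test() (the default `test` in geom_signif()) stops when one group has no values left. A minimal base R sketch of that, assuming all rows of condition C in a panel were dropped as non-finite (the numbers and variable names are made up for illustration):

```r
# Values for two conditions within one facet panel (illustrative numbers).
a <- c(10.4, 11.2, 11.8)
c_vals <- c(NA_real_, NA_real_, NA_real_)

# Mimic the removal of non-finite rows reported in the warnings.
c_finite <- c_vals[is.finite(c_vals)]

# wilcox.test() raises an error when one group is empty;
# capture the error message instead of letting it abort.
msg <- tryCatch(
  {
    wilcox.test(a, c_finite)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If a single failing test aborts the computation for the whole panel, that would explain why the A-B bracket is dropped together with B-C; this is an assumption about the internals, not something checked against the ggsignif source.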

Would it be possible to:

This would be awesome!
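Until a failed comparison can be skipped inside geom_signif(), one possible workaround is to compute the p-values per panel beforehand and turn failures into NA. The following is only a sketch in base R: `safe_p()` and the small `dat` data frame are hypothetical names introduced here, and wilcox.test() is assumed as the test.

```r
# Hypothetical helper: run the test, return NA instead of failing.
safe_p <- function(x, y, test = wilcox.test) {
  x <- x[is.finite(x)]
  y <- y[is.finite(y)]
  tryCatch(test(x, y)$p.value, error = function(e) NA_real_)
}

# Small stand-in for the filtered dataset: condition C of Sample2 is all NA.
set.seed(1)
dat <- data.frame(
  Sample = rep(c("Sample1", "Sample2"), each = 9),
  Cond   = rep(rep(c("A", "B", "C"), each = 3), times = 2),
  Value  = c(runif(3, 10, 12), runif(3, 11, 14), runif(3, 10, 13),
             runif(3, 10, 12), runif(3, 11, 14), NA, NA, NA)
)
comparisons <- list(c("A", "B"), c("B", "C"))

# One row per panel (Sample) and comparison.
grid <- expand.grid(Sample = unique(dat$Sample),
                    idx = seq_along(comparisons),
                    stringsAsFactors = FALSE)
grid$start <- vapply(comparisons[grid$idx], `[`, character(1), 1)
grid$end   <- vapply(comparisons[grid$idx], `[`, character(1), 2)

# p-value per row; comparisons that cannot be computed become NA.
grid$p <- mapply(function(s, a, b) {
  safe_p(dat$Value[dat$Sample == s & dat$Cond == a],
         dat$Value[dat$Sample == s & dat$Cond == b])
}, grid$Sample, grid$start, grid$end, USE.NAMES = FALSE)

# Text label for each bracket; NA p-values are shown as "NA".
grid$label <- format.pval(grid$p, digits = 2)
print(grid)
```

A data frame like `grid`, extended with a column for the bracket height, could then be passed to geom_signif() with manual = TRUE and the xmin, xmax, annotations and y_position aesthetics, which is the route the ggsignif README describes for per-facet annotations. That last step is untested here.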

PS: I was just reading the proDA paper and, while adding this issue here, noticed that the author has the identical name!

const-ae commented 1 year ago

Hey, thank you for the kind words and the well-written bug report. As you have probably already noticed, I am currently not on top of my GitHub issues and don't have the capacity to invest time in adding features to ggsignif. You can of course write a PR to fix the issue, and we will take a look and consider whether we can merge it.

Best, Constantin

murpholinox commented 1 year ago

Same here! ...adding a comment to be notified ...