teunbrand / ggh4x

ggplot extension: options for tailored facets, multiple colourscales and miscellaneous
https://teunbrand.github.io/ggh4x/

Issue with manual axis #60

Closed sbihorel closed 2 years ago

sbihorel commented 2 years ago

Hello,

Thank you very much for publishing this great package. There are really useful functions in it.

I was wondering if/how the guide_axis_manual function can be used when a color or any form of data grouping is applied to the data. The following piece of code plays on the example of geom_violin included in your documentation... I am not sure how to set the breaks correctly.

Thank you in advance for your time

library(ggplot2)
library(ggh4x)

df <- diamonds 
df$colorcat <- ifelse(df$color %in% c('D', 'E', 'F'), 'Cat 1', 'Cat2')

tab <- table(paste(df$cut, df$colorcat))

ggplot(df, aes(cut, price, color = colorcat)) +
  geom_violin() +
  guides(x.sec = guide_axis_manual(
    breaks = names(tab),
    labels = paste0('n = ', tab)
  ))

teunbrand commented 2 years ago

Hello,

Thanks for the kind words! In this case you can set the breaks to a numeric vector giving the center position of each violin. The correct way to calculate these positions depends on the data and on the geom and position adjustment involved. For this particular example, you can subtract 0.225 from, and add 0.225 to, every integer position of the categories along the x-axis.

library(ggplot2)
library(ggh4x)

df <- diamonds 
df$colorcat <- ifelse(df$color %in% c('D', 'E', 'F'), 'Cat 1', 'Cat2')

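# Two-way table: columns follow the factor-level order of cut, so the counts
# line up with the breaks (table(paste(...)) would sort them alphabetically)
tab <- table(df$colorcat, df$cut)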
breaks <- as.vector(outer(c(-0.225, 0.225), seq_len(nlevels(df$cut)), FUN = "+"))

ggplot(df, aes(cut, price, color = colorcat)) +
  geom_violin() +
  guides(x.sec = guide_axis_manual(
    breaks = breaks,
    labels = paste0('n = ', tab)
  ))

Created on 2021-12-20 by the reprex package (v2.0.1)
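
The ±0.225 offsets are specific to two dodged groups. As a rough sketch, assuming ggplot2's default dodge width of 0.9 and that every group is present at every x position, the centers for any number of groups could be computed with a small helper. The name `dodge_centres()` and its arguments are hypothetical and not part of ggh4x or ggplot2.

```r
# Hypothetical helper (not part of ggh4x or ggplot2).
# Assumes every dodge group occurs at every category, so each category is
# split into `n_groups` equally wide slots spanning `dodge_width` in total.
dodge_centres <- function(n_categories, n_groups, dodge_width = 0.9) {
  # Offset of each slot centre relative to the integer category position
  offsets <- (seq_len(n_groups) - (n_groups + 1) / 2) * dodge_width / n_groups
  # Column-major result: all groups of category 1, then category 2, ...
  as.vector(outer(offsets, seq_len(n_categories), FUN = "+"))
}

# Two groups over five categories reproduces the -0.225 / +0.225 offsets
breaks <- dodge_centres(n_categories = 5, n_groups = 2)
```

If some groups are missing from some categories, the dodging changes and these positions would need to be adjusted.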

sbihorel commented 2 years ago

Thanks

This works. However, please allow me to follow up with a more complex case that reflects my actual plot needs more closely than my original, oversimplified reprex. What would you suggest when faceting and free axis ranges are also applied?

library(ggplot2)
library(ggh4x)

df <- diamonds 
df$colorcat <- ifelse(df$color %in% c('D', 'E', 'F'), 'Cat 1', 'Cat2')
df$cutcat <- ifelse(df$cut %in% c('Fair', 'Good'), 'Cat A', 'Cat B')

tab <- table(paste(df$cut, df$colorcat, df$cutcat))
tab
breaks <- as.vector(outer(c(-0.225, 0.225), seq_len(nlevels(df$cut)), FUN = "+"))

ggplot(df, aes(cut, price, color = colorcat)) +
  geom_violin() +
  facet_wrap( . ~ cutcat, scales = 'free_x') +
  guides(x.sec = guide_axis_manual(
    breaks = breaks,
    labels = paste0('n = ', tab)
  ))

PS: I am grateful for your time and help. Several hours of unsuccessful exploration of various break options (including numeric values) led me to post yesterday... I could not hack the ggh4x hack :D

teunbrand commented 2 years ago

The problem in this case is that guides() sets axes globally, whereas you need to manage the breaks on a per-panel basis. You could do this with secondary axes and facetted_pos_scales(), but discrete scales don't accept secondary axes. Hence, the unwieldy solution that would work is to first convert the discrete x-axis data to integer positions, and then use the secondary axes of continuous scales to set the guide. This becomes complicated, but it is doable.

library(ggplot2)
library(ggh4x)

df <- diamonds 
df$colorcat <- ifelse(df$color %in% c('D', 'E', 'F'), 'Cat 1', 'Cat2')
df$cutcat <- ifelse(df$cut %in% c('Fair', 'Good'), 'Cat A', 'Cat B')

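# Two-way table: columns follow the factor-level order of cut, so linear
# indexing yields Fair, Good, Very Good, Premium, Ideal (two counts each)
tab <- table(df$colorcat, df$cut)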
breaks <- as.vector(outer(c(-0.225, 0.225), seq_len(nlevels(df$cut)), FUN = "+"))

discretise <- function(x, group) {
  x <- split(x, group)
  x <- lapply(x, function(y) {
    match(y, levels(droplevels(y)))
  })
  x <- unsplit(x, group)
  x
}

g <- ggplot(df, aes(discretise(cut, cutcat), 
               price, color = colorcat,
               group = interaction(cut, colorcat))) +
  geom_violin() +
  facet_wrap( . ~ cutcat, scales = 'free_x') +
  facetted_pos_scales(
    x = list(
      cutcat == "Cat A" ~ scale_x_continuous(
        breaks = 1:2, name = NULL,
        labels = c("Fair", "Good"),
        sec.axis = dup_axis(
          guide = guide_axis_manual(
            breaks = as.vector(outer(c(-0.225, 0.225), 1:2, FUN = "+")),
            labels = paste0('n = ', tab[1:4])
          )
        )
      ),
      cutcat == "Cat B" ~ scale_x_continuous(
        breaks = 1:3, name = NULL,
        labels = c("Very Good", "Premium", "Ideal"),
        sec.axis = dup_axis(
          guide = guide_axis_manual(
            breaks = as.vector(outer(c(-0.225, 0.225), 1:3, FUN = "+")),
            labels = paste0("n = ", tab[-c(1:4)])
          )
        )
      )
    )
  )

Created on 2021-12-20 by the reprex package (v2.0.1)
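
To illustrate what discretise() does, here is a small base-R example on made-up data (the vectors `x` and `group` are invented for this illustration). Within each facet group, every category is replaced by its rank among the levels actually present in that group, so each panel gets positions 1, 2, ... that a continuous scale can then label.

```r
# Same function as above, repeated so this example runs on its own
discretise <- function(x, group) {
  x <- split(x, group)
  x <- lapply(x, function(y) {
    # Position of each value among the levels still present in this group
    match(y, levels(droplevels(y)))
  })
  unsplit(x, group)
}

# Made-up data using the same level order as diamonds$cut
lv <- c("Fair", "Good", "Very Good", "Premium", "Ideal")
x <- factor(c("Fair", "Good", "Very Good", "Premium", "Ideal", "Good"), levels = lv)
group <- c("A", "A", "B", "B", "B", "A")

pos <- discretise(x, group)
pos
#> [1] 1 2 1 2 3 2
```

Note that "Very Good" maps to 1 rather than 3, because positions restart within each group.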

sbihorel commented 2 years ago

Awesome! Thank you very much for your help

I will look into how to generalize this code example, as I am working on a data-agnostic plotting workflow.
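
One possible starting point for such a generalization, as a sketch only: compute the per-panel tick positions and count labels from the data instead of hard-coding them. The helper `panel_count_axis()` and the `toy` data frame are hypothetical. The sketch assumes a dodge width of 0.9 and that every dodge group occurs at every x position within a panel.

```r
# Hypothetical helper: per-panel tick positions and count labels.
# Assumes `x` is a factor and that every dodge group occurs at every
# x position within a panel (otherwise the dodge offsets differ).
panel_count_axis <- function(x, dodge_group, panel, dodge_width = 0.9) {
  stopifnot(is.factor(x))
  dodge_group <- factor(dodge_group)
  n_groups <- nlevels(dodge_group)
  offsets <- (seq_len(n_groups) - (n_groups + 1) / 2) * dodge_width / n_groups
  lapply(split(seq_along(x), panel), function(rows) {
    x_panel <- droplevels(x[rows])
    # Rows are dodge groups, columns are categories in factor-level order
    counts <- table(dodge_group[rows], x_panel)
    n_cat <- nlevels(x_panel)
    list(
      main_breaks = seq_len(n_cat),
      main_labels = levels(x_panel),
      sec_breaks  = as.vector(outer(offsets, seq_len(n_cat), FUN = "+")),
      sec_labels  = paste0("n = ", as.vector(counts))
    )
  })
}

# Made-up data with the same level order as diamonds$cut
toy <- data.frame(
  cut = factor(
    c("Fair", "Fair", "Good", "Good", "Good", "Ideal", "Ideal", "Premium", "Premium"),
    levels = c("Fair", "Good", "Very Good", "Premium", "Ideal")
  ),
  colorcat = c("Cat 1", "Cat2", "Cat 1", "Cat2", "Cat2", "Cat 1", "Cat2", "Cat 1", "Cat2")
)
toy$cutcat <- ifelse(toy$cut %in% c("Fair", "Good"), "Cat A", "Cat B")

spec <- panel_count_axis(toy$cut, toy$colorcat, toy$cutcat)
```

Each element of `spec` could then supply the breaks and labels for one panel's scale_x_continuous() and guide_axis_manual() calls.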

sbihorel commented 2 years ago

Hi again,

I got your solution to work within my plotting framework... However, I have an issue displaying an X-axis title. The built object has the label in it but it is not displayed. Would you have a suggestion to bypass this issue?

library(ggplot2)
library(ggh4x)

df <- diamonds 
df$colorcat <- ifelse(df$color %in% c('D', 'E', 'F'), 'Cat 1', 'Cat2')
df$cutcat <- ifelse(df$cut %in% c('Fair', 'Good'), 'Cat A', 'Cat B')

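# Two-way table: columns follow the factor-level order of cut, so linear
# indexing yields Fair, Good, Very Good, Premium, Ideal (two counts each)
tab <- table(df$colorcat, df$cut)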
breaks <- as.vector(outer(c(-0.225, 0.225), seq_len(nlevels(df$cut)), FUN = "+"))

discretise <- function(x, group) {
  x <- split(x, group)
  x <- lapply(x, function(y) {
    match(y, levels(droplevels(y)))
  })
  x <- unsplit(x, group)
  x
}

g <- ggplot(df, aes(discretise(cut, cutcat), 
               price, color = colorcat,
               group = interaction(cut, colorcat))) +
  geom_violin() +
  xlab('Some axis title') +
  facet_wrap( . ~ cutcat, scales = 'free_x') +
  facetted_pos_scales(
    x = list(
      cutcat == "Cat A" ~ scale_x_continuous(
        breaks = 1:2, name = NULL,
        labels = c("Fair", "Good"),
        sec.axis = dup_axis(
          guide = guide_axis_manual(
            breaks = as.vector(outer(c(-0.225, 0.225), 1:2, FUN = "+")),
            labels = paste0('n = ', tab[1:4])
          )
        )
      ),
      cutcat == "Cat B" ~ scale_x_continuous(
        breaks = 1:3, name = NULL,
        labels = c("Very Good", "Premium", "Ideal"),
        sec.axis = dup_axis(
          guide = guide_axis_manual(
            breaks = as.vector(outer(c(-0.225, 0.225), 1:3, FUN = "+")),
            labels = paste0("n = ", tab[-c(1:4)])
          )
        )
      )
    )
  )

teunbrand commented 2 years ago

Yes, simply remove name = NULL from the scale_x_continuous() calls, and the axis title should be displayed. If you want to hide the secondary axis title, you can add title = NULL inside the guide_axis_manual() function or set name = NULL in the dup_axis() function.

sbihorel commented 2 years ago

Great !

Thank you very much for your help

teunbrand commented 2 years ago

If everything seems to work as expected, I'll close this issue for now.