tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

Axis alignment over multiple panels #5826

Closed teunbrand closed 1 week ago

teunbrand commented 5 months ago

This is a proof-of-concept PR exploring a fix for #5820.

Briefly, this PR adds extra spacers to the axis gtable that have 'null' units so that the spacers take up the 'available' space. This works to align labels across panels as 'available space' is flexible and adapts to the size of the cell in the plot's gtable, which is always* based on the largest axis.

* See caveat below

devtools::load_all("~/packages/ggplot2")
#> ℹ Loading ggplot2

data.frame(
  F = c(rep("F1", 3), rep("F2", 3)),
  Y = c("AAAAAAAAAAAAAAAAAAA", "BBBBB", "CCCCCCCCC", "DDDDD", "EEEEEEEE", "FFF"),
  Cnt = c(10, 20, 30, 50, 40, 60)
) |>
  ggplot(mapping = aes(y = Y, x = Cnt)) +
  geom_col() +
  facet_grid(
    rows = vars(F),
    scales = "free_y"
  ) +
  theme(
    axis.text.y = element_text(hjust = 0)
  )
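The 'null' unit behaviour that the spacers rely on can be checked in isolation with plain grid, without touching ggplot2. The following is only a minimal sketch, not the code in this PR: `spacer_width()`, `cell_cm` and `content_cm` are hypothetical names made up for illustration, and the fixed-width column merely stands in for an axis label.

```r
library(grid)

# Width (in cm) that a 1-null spacer receives when it shares a cell that is
# `cell_cm` wide with fixed content that is `content_cm` wide.
# Hypothetical helper for illustration only; not part of ggplot2.
spacer_width <- function(cell_cm, content_cm) {
  pdf(NULL)            # off-screen device, no file is written
  on.exit(dev.off())   # close the device again when the function exits

  # The parent viewport plays the role of the cell in the plot's gtable
  pushViewport(viewport(width = unit(cell_cm, "cm")))

  # Two columns: fixed-width content followed by a flexible 'null' spacer
  layout <- grid.layout(
    nrow = 1, ncol = 2,
    widths = unit.c(unit(content_cm, "cm"), unit(1, "null"))
  )
  pushViewport(viewport(layout = layout))

  # Step into the spacer column and measure how wide it turned out
  pushViewport(viewport(layout.pos.col = 2))
  convertWidth(unit(1, "npc"), "cm", valueOnly = TRUE)
}

spacer_width(cell_cm = 5, content_cm = 2)  # narrow content: spacer takes the rest
spacer_width(cell_cm = 5, content_cm = 5)  # widest content: nothing left over
```

Under these assumptions the spacer should come out at roughly the cell width minus the content width, so a panel with short labels gets padded up to the width set by the panel with the longest labels.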

I consider this PR a POC because this solution is not perfect. In some circumstances, the plot's gtable cell size allocated to axes is smaller than the actual axis size, in particular when facet_wrap() has ragged panels that need to be labelled on the ragged ends. In the plot below, notice that the right axis of the bottom-middle panel and the bottom axis of the top-right panel are both misplaced.

# The only failing unit test is this plot
ggplot(mtcars, aes(mpg, disp)) +
  geom_point() +
  guides(x = "axis", y = "axis", x.sec = "axis", y.sec = "axis") + 
  facet_wrap(vars(cyl, vs), axes = "all", axis.labels = "margins")

Created on 2024-04-04 with reprex v2.1.0

However, the axis doesn't know when it will be placed in smaller cells, and installing the plumbing to let the axis know would require tampering with already very complicated code that places back axes. Unless I can find some trick to mitigate this problem, I'm stuck at this 90% solution, which is why I wouldn't merge this PR for now.

teunbrand commented 3 months ago

TODO: Benchmark time for regular plots

teunbrand commented 3 months ago

Just for clarity, the plot that failed before no longer fails and gives a nice layout:

devtools::load_all("~/packages/ggplot2")
#> ℹ Loading ggplot2

ggplot(mtcars, aes(mpg, disp)) +
  geom_point() +
  guides(x = "axis", y = "axis", x.sec = "axis", y.sec = "axis") + 
  facet_wrap(vars(cyl, vs), axes = "all", axis.labels = "margins")

Created on 2024-05-21 with reprex v2.1.0

teunbrand commented 3 months ago

For benchmarks I performed the following on the current main branch:

devtools::load_all("~/packages/ggplot2")
#> ℹ Loading ggplot2

plot <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(cyl, year))

build <- ggplot_build(plot)
gt <- ggplot_gtable(build)

tmp <- tempfile(fileext = ".pdf")
pdf(tmp)

render <- function(table) {
  grid.newpage()
  grid.draw(table)
}

bench::mark(
  build  = ggplot_build(plot),
  table  = ggplot_gtable(build),
  render = render(gt), 
  min_iterations = 10, 
  check = FALSE
)
#> # A tibble: 3 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 build        50.8ms   51.7ms      19.3    2.18MB     8.29
#> 2 table        97.1ms   97.5ms      10.3    4.45MB    15.4 
#> 3 render       48.1ms   49.2ms      20.3  655.27KB     8.68

Created on 2024-05-21 with reprex v2.1.0

And then I repeated that exact code for this PR to get the following result:

#> # A tibble: 3 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 build        50.6ms   51.9ms      19.2    2.18MB     8.22
#> 2 table        97.8ms   98.7ms      10.1    4.45MB    15.2 
#> 3 render       50.6ms   51.5ms      19.4  655.27KB     8.32

I think the 1-2 ms cost is bearable.
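For reference, the per-stage difference can be worked out from the medians in the two tables above. This is just a sketch of the arithmetic: the numbers are copied by hand from those tables, and `main`, `pr`, `delta` and `rel` are names chosen here for illustration.

```r
# Median timings in milliseconds, copied from the two bench::mark() tables
main <- c(build = 51.7, table = 97.5, render = 49.2)
pr   <- c(build = 51.9, table = 98.7, render = 51.5)

delta <- round(pr - main, 1)           # absolute difference in ms
rel   <- round(100 * delta / main, 1)  # difference relative to main, in percent

data.frame(main, pr, delta, rel)
```

These are single runs on one machine with at least 10 iterations each, so differences of this size may partly be noise.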