slowkow / ggrepel

Repel overlapping text labels away from each other in your ggplot2 figures.
https://ggrepel.slowkow.com
GNU General Public License v3.0

Support for {gridtext} #249

Open teunbrand opened 9 months ago

teunbrand commented 9 months ago

This PR aims to fix #169.

Briefly, this PR gives geom_text_repel() a grob argument that can be used to provide a different grob constructor than textGrob(). You can pass, for example, a gridtext::richtext_grob() to draw the labels.

devtools::load_all("~/packages/gridtext/") # explained later why I'm using devtools here
#> ℹ Loading gridtext
devtools::load_all("~/packages/ggrepel")
#> ℹ Loading ggrepel
#> Loading required package: ggplot2

logo <- system.file("extdata", "Rlogo.png", package = "gridtext")

df <- data.frame(
  x = c(0, 4.9,5,5.1, 10),
  y = c(0, 4.9,5,5.1, 10),
  labels = c(
    "Some <span style='color:blue'>*italic blue*</span> text",
    "Other <span style='color:red'>**bold red**</span> text",
    paste0("<img src='", logo, "' width = '50'/>"),
    "Text with some <sup>superscript</sup>",
    "Drink enough H<sub>2</sub>O."
  )
)

ggplot(df, aes(x, y)) +
  geom_point() +
  geom_text_repel(
    aes(label = labels), 
    grob = richtext_grob
  )

To pass additional arguments to that grob constructor that aren't plumbed through the parameters or aesthetics, you pass a list to grob_args.

ggplot(df, aes(x, y)) +
  geom_point() +
  geom_text_repel(
    aes(label = labels), 
    grob = richtext_grob,
    grob_args = list(
      padding = margin(10, 10, 10, 10),
      box_gp = gpar(fill = "#88888844", col = "grey"),
      r = unit(10, "pt")
    )
  )

Created on 2024-01-13 with reprex v2.0.2
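
As a rough sketch of the idea behind grob and grob_args, using only {grid}: a constructor function and a list of extra arguments can be combined with do.call(). The helper name build_label() and the way grob_args overrides the default arguments are assumptions for illustration, not the code in this PR.

```r
library(grid)

# Hypothetical helper (not part of ggrepel): build one label grob from a
# constructor function and a list of extra arguments.
build_label <- function(label, x, y, grob = textGrob, grob_args = list()) {
  args <- list(label = label, x = unit(x, "npc"), y = unit(y, "npc"))
  # Assigning by name lets grob_args override a default instead of passing
  # the same argument twice, which do.call() would reject.
  args[names(grob_args)] <- grob_args
  do.call(grob, args)
}

plain   <- build_label("a label", 0.5, 0.5)
rotated <- build_label("a label", 0.5, 0.5, grob_args = list(rot = 45))
```

Here rot is an ordinary textGrob() argument; with a different constructor, grob_args would hold whatever extra arguments that constructor accepts.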

To get this to work I had to make the following changes:

  1. Install the plumbing for grob and grob_args to reach the place where they're used and avoid argument clashes and mismatches.
  2. I had to change how the sizes of labels are measured. For rich text grobs, the labels can contain a lot of additional HTML/markdown markup that counts towards the stringWidth() or stringHeight() of the label, but is not part of the actual size of the drawn label. Now, the grobs themselves are measured instead of just their labels.
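
The measuring problem in point 2 can be illustrated with {grid} alone. In this sketch a plain textGrob() of the visible text stands in for a rich text grob (an assumption, so that the example does not need {gridtext}); the point is only that stringWidth() on the raw markup is much wider than what would be drawn.

```r
library(grid)
pdf(NULL)  # off-screen device, so text can be measured without writing a file

markup  <- "Other <span style='color:red'>**bold red**</span> text"
visible <- "Other bold red text"  # roughly what a rich text renderer would draw

to_inches <- function(u) convertWidth(u, "inches", valueOnly = TRUE)

w_string <- to_inches(stringWidth(markup))           # measures the raw markup
w_grob   <- to_inches(grobWidth(textGrob(visible)))  # measures a grob instead

c(string = w_string, grob = w_grob)
invisible(dev.off())
```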

I've marked this PR as a draft because there is a caveat. You cannot use just any grob constructor as the grob argument. Well, you probably can, but that would return errors or draw the labels incorrectly. The grob must have decent xDetails() and yDetails() methods to be measured accurately. Currently, {gridtext} does not have these methods, but I've offered to implement them in https://github.com/wilkelab/gridtext/issues/33. That is why I'm using my local copy of {gridtext} in the example: it points to a local repo with those changes implemented. It would probably make sense to merge this PR only if {gridtext} accepts the other PR.

teunbrand commented 9 months ago

To explain a bit more about the xDetails() / yDetails() requirement: these methods are needed for grobX() and grobY() to return sensible values. If we make a grob builder that returns a grob that does not implement these methods, you get a crappy plot:

library(grid)
devtools::load_all("~/packages/ggrepel")
#> ℹ Loading ggrepel
#> Loading required package: ggplot2

custom_text <- function(label, x, y, hjust, vjust, gp = gpar(), 
                        default.units = "npc", ..., colours = rainbow(10)) {
  txt <- textGrob(
    label, x, y, hjust = hjust, vjust = vjust, 
    gp = gp, default.units = default.units
  )
  grob <- fillStrokeGrob(
    txt, gp = gpar(
      col = "black", lwd = 1,
      fill = linearGradient(colours)
    )
  )
  grob
}

ggplot(mtcars, aes(wt, mpg, label = rownames(mtcars))) +
  geom_point(colour = 'red') +
  expand_limits(x = c(1, 7), y = c(12, 24)) +
  geom_text_repel(grob = custom_text, max.overlaps = Inf, fontface = "bold")

Created on 2024-01-13 with reprex v2.0.2

But if you implement these methods correctly, that should again give sensible output:

xDetails.GridFillStroke <- function(x, theta) {
  xDetails(x$path, theta)
}
yDetails.GridFillStroke <- function(x, theta) {
  yDetails(x$path, theta)
}

last_plot()

Created on 2024-01-13 with reprex v2.0.2
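
The effect of a missing xDetails() method can also be seen without drawing anything. The sketch below uses a hypothetical grob class, my_label, that stores a text grob in a field called inner (both names are made up for illustration). Without a method, grobX() falls back to grid's default, which as far as I know is the centre of the viewport; with a method that delegates to the inner grob, it reports the real edge of the text. The method is registered with registerS3method() only so that the example behaves the same wherever it is run.

```r
library(grid)
pdf(NULL)  # off-screen device for measuring

# Hypothetical wrapper class: a grob that keeps a text grob in `inner`.
inner <- textGrob("label", x = unit(0.2, "npc"), y = unit(0.5, "npc"))
g <- grob(inner = inner, cl = "my_label")

# x-position of the grob's east edge, as a fraction of the viewport width
east_npc <- function(g) convertX(grobX(g, "east"), "npc", valueOnly = TRUE)

before <- east_npc(g)  # no xDetails method for "my_label" yet

# Delegate to the wrapped text grob, like xDetails.GridFillStroke above.
registerS3method(
  "xDetails", "my_label",
  function(x, theta) xDetails(x$inner, theta),
  envir = asNamespace("grid")
)

after <- east_npc(g)   # now reflects the text's actual east edge
invisible(dev.off())
```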

slowkow commented 9 months ago

Thank you so much for writing this up! I like it a lot, and I'd be excited to review and merge this once you let me know that gridtext is ready. (Sorry, I have not read your code yet; I'll study it later when I have more time.)

Could I ask about the magic behind xDetails.GridFillStroke? I'm not familiar with the magical naming conventions and I get confused because I don't know the full list of names that will be searched for. I guess the function needs to be named in exactly this way because the object system is searching for a function with this specific name? Is there a nice reference (website or book) to get started learning about this naming convention? So many years later, I still find grid to be mysterious.

An aside: Since I'm not an expert with grid, your comment regarding stringWidth() reminds me that there may be a possibility that the way I call convertWidth(x, "native", valueOnly = TRUE) in ggrepel is totally wrong. Maybe it's OK for now (since it seems to work well enough for many people), but I don't want my silly mistakes to prohibit developers like yourself from making cool new features like this PR.

teunbrand commented 9 months ago

Thanks for the response!

> Could I ask about the magic behind xDetails.GridFillStroke?

It is just the S3 OOP convention that {function_name}.{class_name} is a method for the class_name class, so that xDetails.GridFillStroke is an xDetails method for the GridFillStroke class. It's used in a bunch of packages, not just grid. It is even used right here in ggrepel:

https://github.com/slowkow/ggrepel/blob/cb2de65ada832e67f258a509e73805a10d1d0ad3/R/geom-text-repel.R#L358
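
A minimal, self-contained illustration of that naming convention, with made-up names (describe and label are not from grid or ggrepel): the generic calls UseMethod(), and R then looks for describe.<class> for each class of the object in turn, falling back to describe.default.

```r
# A generic: UseMethod() dispatches on the class of the first argument.
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no class-specific method is found.
describe.default <- function(x, ...) "something else"

# Method for objects of class "label", found purely through its name.
describe.label <- function(x, ...) paste("a label:", unclass(x))

obj <- structure("hello", class = "label")

res_label   <- describe(obj)  # dispatches to describe.label
res_default <- describe(1)    # no describe.numeric, so describe.default
```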

> a possibility that the way I call convertWidth(x, "native", valueOnly = TRUE) in ggrepel is totally wrong.

I don't think it is wrong: by the time this is called, the device dimensions are already known. There might be a small performance benefit to consistently using absolute units that don't differ between the x- and y-directions, but it shouldn't be huge.
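
To make the point about units concrete, here is a small sketch with a made-up viewport whose x and y scales differ. Whether this resembles the viewport ggrepel actually draws in is not checked here; it only shows that the same physical length maps to different "native" values in each direction, while absolute units such as inches are the same in both.

```r
library(grid)
pdf(NULL, width = 7, height = 7)  # off-screen 7 x 7 inch device

# Illustrative viewport with different x and y scales.
pushViewport(viewport(xscale = c(0, 10), yscale = c(0, 100)))

one_inch <- unit(1, "inches")
native_w <- convertWidth(one_inch, "native", valueOnly = TRUE)   # along x
native_h <- convertHeight(one_inch, "native", valueOnly = TRUE)  # along y

# The reverse direction: one seventh of the viewport in absolute units.
inches_w <- convertWidth(unit(1 / 7, "npc"), "inches", valueOnly = TRUE)
inches_h <- convertHeight(unit(1 / 7, "npc"), "inches", valueOnly = TRUE)

popViewport()
invisible(dev.off())
```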

slowkow commented 9 months ago

Thank you for the link to the S3 OOP book! This is exactly what I need.

For ggrepel I just copied what I saw in the source code for ggplot2, but I didn't fully understand how it works.

I was wondering where the name GridFillStroke comes from and I found my answer in this file:

https://svn.r-project.org/R/trunk/src/library/grid/R/path.R

It is defined in the grid package (which is included in base R). Specifically, in the file path.R, these functions assign the class by passing cl="GridFillStroke" to grob():

fillStrokeGrob.grob <- function(x, rule=c("winding", "evenodd"),
                                name=NULL, gp=gpar(), vp=NULL, ...) {
    fillStroke <- grob(path=x, rule=match.arg(rule),
                       name=name, gp=gp, vp=vp, cl="GridFillStroke")
    fillStroke
}

fillStrokeGrob.GridPath <- function(x, name=NULL, vp=NULL, ...) {
    fillStroke <- grob(path=x$grob, rule=x$rule,
                       name=name, gp=x$gp, vp=vp, cl="GridFillStroke")
    fillStroke
}

Now that I understand where GridFillStroke is coming from, I have a better understanding of how your code works. This is great!
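
A quick way to check this from the console, assuming R 4.2 or later (where fillStrokeGrob() is available): wrap a text grob and inspect the class and the path field that the xDetails.GridFillStroke method above delegates to.

```r
library(grid)

txt <- textGrob("label")
fs  <- fillStrokeGrob(txt)  # dispatches to fillStrokeGrob.grob

class(fs)                                   # includes "GridFillStroke"
is_fill_stroke <- inherits(fs, "GridFillStroke")
path_is_text   <- inherits(fs$path, "text") # the wrapped text grob
```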