slowkow / ggrepel

Repel overlapping text labels away from each other in your ggplot2 figures.
https://ggrepel.slowkow.com
GNU General Public License v3.0

Feature request: Ignore max.overlaps for specific points #230

Closed mschubert closed 8 months ago

mschubert commented 1 year ago

It happens to me fairly often that I have a cloud of points, for which I want to label the outliers as well as a few specific points inside the cloud.

It would be nice if ggrepel could draw labels that are of particular interest even when they sit in a region that exceeds max.overlaps. One way to implement this could be to pass an additional argument that specifies which points to label, irrespective of whether they reside in a dense region.

aphalo commented 1 year ago

Do you mean to manually prioritize some labels over others when they cannot all be plotted without overlaps?

mschubert commented 1 year ago

Yes, that's another way of putting it

slowkow commented 1 year ago

Could I please ask if you might consider sharing a minimal code example to demonstrate the issue?

Also, you might like to explore the functions in the ggpp package. Maybe they are helpful in your situation?

mschubert commented 1 year ago

Sure! So let's say I have a point cloud (grey), for which I want to label the outliers and specific points (black):

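library(ggplot2)
# Build 62 * 62 = 3844 two-character labels and sample 10,000 of them
syms = c(letters, LETTERS, 0:9)
labs = do.call(paste0, expand.grid(syms, syms))
dset = data.frame(x=rnorm(1e4), y=rnorm(1e4), label=sample(labs, 1e4, replace=TRUE))
# Points of interest are drawn in black, the rest of the cloud in grey
ggplot(dset, aes(x=x, y=y)) +
    geom_point(aes(color=label %in% c("aA", "bB", "cC"))) +
    scale_color_manual(values=c("TRUE"="black", "FALSE"="grey")) +
    # Labels that overlap more than 3 other things are dropped
    ggrepel::geom_text_repel(aes(label=label), max.overlaps=3)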

[image: plot produced by the code above]

I would like to label the outliers as they are now and, in addition, label the black points. I would like to do this in one call to geom_text_repel, because otherwise some labels may be nudged into the others.

Thanks for pointing me to ggpp. I see that stat_dens2d_filter does something similar, and perhaps it would be better suited there (but I don't think it's currently able to achieve this).

slowkow commented 1 year ago

OK, I think I might understand your request.

  1. Would you consider a workaround with 2 calls to geom_text_repel() (one for the grey dots and one for the black dots)?

  2. Could I ask if you have a suggestion for how to implement your desired feature?

geom_text_repel(aes(label = label), max.overlaps = ???)

For example, would you want the value passed to max.overlaps to be a vector of numbers (one for each label) instead of a single number?

aphalo commented 1 year ago

I think implementing a way of doing this in stat_dens2d_filter() would be easy, but an interface that is consistent with the grammar of graphics needs some thought. A formal parameter protect taking a logical or integer vector that can be used as indexes (or subscripts) could work, but would be atypical for the grammar of graphics. So a function that does a test based on the label text, passed as the argument to protect, would be best, I think. Any suggestions or thoughts?

aphalo commented 1 year ago

I think the best approach would be to accept a function or a character vector of labels. So, if we want to protect only a few labels, we would pass a vector of label texts; in other cases a user-defined function using grepl() or grep() could be passed. However, in your example you seem to be willing for labels to overlap dots, and this would certainly be the case for the protected labels. You would anyway need multiple layers in the figure: one for all the points, one for the labelled points (if you want to highlight them), and one for the labels. (In your example some of the black dots are occluded by the grey ones and not visible.)

I will try to implement something like this in the next version of 'ggpp', as it seems generally useful.
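To make the proposed interface a little more concrete, here is a minimal base R sketch of how a "function or character vector" specification could be turned into a logical selector. The helper name `protected_labels()` and its argument names are hypothetical, chosen only for illustration; this is not code from 'ggpp' or 'ggrepel'.

```r
# Hypothetical helper (not part of 'ggpp' or 'ggrepel'): convert a "protect"
# specification into a logical vector with one element per label.
protected_labels <- function(labels, protect) {
  if (is.function(protect)) {
    keep <- protect(labels)
    # A grep()-style function returns integer indexes; convert them to logical
    if (is.numeric(keep)) keep <- seq_along(labels) %in% keep
    as.logical(keep)
  } else if (is.character(protect)) {
    # A character vector is matched against the label text directly
    labels %in% protect
  } else {
    stop("`protect` must be a function or a character vector")
  }
}

labels <- c("aA", "bB", "xY", "cC", "aZ")
protected_labels(labels, c("aA", "bB", "cC"))         # character vector
protected_labels(labels, function(x) grepl("^a", x))  # logical-returning function
protected_labels(labels, function(x) grep("^a", x))   # index-returning function
```

Accepting both forms keeps the common case (a handful of label texts) short while still allowing pattern-based selection.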

mschubert commented 1 year ago

Thanks a lot for your answers!

> Would you consider a workaround with 2 calls to geom_text_repel() (one for the grey dots and one for the black dots)?

That's what I did in the past, but sometimes the new labels will be pushed over the old labels. So this is unfortunately not a good solution.
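For reference, the two-layer workaround might look roughly like the sketch below. The `special` column and the seed are assumptions added for illustration. `max.overlaps = Inf` lifts the overlap limit for the second layer. Each layer is repelled on its own, so labels from one layer can still land on top of labels from the other, which is the limitation described above.

```r
library(ggplot2)
library(ggrepel)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
syms <- c(letters, LETTERS, 0:9)
labs <- do.call(paste0, expand.grid(syms, syms))
dset <- data.frame(x = rnorm(1e4), y = rnorm(1e4),
                   label = sample(labs, 1e4, replace = TRUE))

# Hypothetical helper column marking the points of particular interest
dset$special <- dset$label %in% c("aA", "bB", "cC")

p <- ggplot(dset, aes(x = x, y = y)) +
  geom_point(aes(color = special)) +
  scale_color_manual(values = c("TRUE" = "black", "FALSE" = "grey")) +
  # Layer 1: the rest of the cloud, labels in dense regions are dropped
  geom_text_repel(data = subset(dset, !special),
                  aes(label = label), max.overlaps = 3) +
  # Layer 2: the points of interest, always labelled
  geom_text_repel(data = subset(dset, special),
                  aes(label = label), max.overlaps = Inf)
p
```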

> Could I ask if you have a suggestion for how to implement your desired feature?

I could see (1) the vector of max.overlaps that you are suggesting, or (2) an additional argument for which points to ignore the overlaps (e.g. ignore.overlaps, which may be a logical vector or a function).

> I think implementing a way of doing this in stat_dens2d_filter() would be easy

I'm starting to lean towards addressing this in stat_dens2d_filter() because the function is already applying a geom to a subset of the data, which is the same class of problem that my use case is about.
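The "apply a geom to a subset of the data" idea can also be sketched by hand, without any new arguments: choose the rows to label first, then make a single geom_text_repel() call on that subset. The sketch below assumes that "outlier" means one of the 50 points farthest from the origin, which is a crude stand-in for a proper density estimate. The names `is_outlier`, `is_special` and `to_label` are made up for illustration.

```r
library(ggplot2)
library(ggrepel)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
syms <- c(letters, LETTERS, 0:9)
labs <- do.call(paste0, expand.grid(syms, syms))
dset <- data.frame(x = rnorm(1e4), y = rnorm(1e4),
                   label = sample(labs, 1e4, replace = TRUE))

# Assumption: the 50 points farthest from the origin count as outliers
dist_from_origin <- sqrt(dset$x^2 + dset$y^2)
is_outlier <- rank(-dist_from_origin, ties.method = "first") <= 50
is_special <- dset$label %in% c("aA", "bB", "cC")

# One subset, one repel layer: outliers plus the points of interest
to_label <- dset[is_outlier | is_special, ]

p <- ggplot(dset, aes(x = x, y = y)) +
  geom_point(colour = "grey") +
  geom_point(data = dset[is_special, ], colour = "black") +
  geom_text_repel(data = to_label, aes(label = label),
                  max.overlaps = Inf, min.segment.length = 0)
p
```

Because all labels live in one layer, they repel each other. The trade-off is that they only avoid the points in `to_label`, not the rest of the cloud.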

aphalo commented 1 year ago

@mschubert @slowkow With future 'ggpp' 0.5.1 or the current GitHub version of 'ggpp', the plot could be created as shown below. In this case the second example is the simplest, but in a function one can use grep() or grepl(). Not exemplified is the use of numeric or logical vectors as arguments to keep.these.

library(ggplot2)
library(ggpp)
#> 
#> Attaching package: 'ggpp'
#> The following object is masked from 'package:ggplot2':
#> 
#>     annotate
library(ggrepel)
syms = c(letters, LETTERS, 0:9)
labs = do.call(paste0, expand.grid(syms, syms))
dset = data.frame(x=rnorm(1e4), y=rnorm(1e4), label=sample(labs, 1e4, replace=TRUE))
ggplot(dset, aes(x=x, y=y, label = label)) +
  geom_point(colour = "grey") +
  stat_dens2d_filter(geom = "text_repel",
                     position = position_nudge_centre(x = 0.1, y = 0.1, direction = "radial"),
                     keep.number = 50,
                     keep.these = function(x) {x %in% c("aA", "bB", "cC")},
                     min.segment.length = 0) +
  theme_bw()


ggplot(dset, aes(x=x, y=y, label = label)) +
  geom_point(colour = "grey") +
  stat_dens2d_filter(geom = "text_repel",
                     position = position_nudge_centre(x = 0.1, y = 0.1, direction = "radial"),
                     keep.number = 50,
                     keep.these = c("aA", "bB", "cC"),
                     min.segment.length = 0) +
  theme_bw()

Created on 2023-01-20 with reprex v2.0.2
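As a small variation on the first example, the function passed to keep.these can use grepl() on the label text, as mentioned above. This is only a sketch: it assumes 'ggpp' 0.5.1 or later, and the pattern and the name `matches_a_to_c` are chosen purely for illustration.

```r
library(ggplot2)
library(ggpp)     # assumption: version 0.5.1 or later, which has keep.these
library(ggrepel)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
syms <- c(letters, LETTERS, 0:9)
labs <- do.call(paste0, expand.grid(syms, syms))
dset <- data.frame(x = rnorm(1e4), y = rnorm(1e4),
                   label = sample(labs, 1e4, replace = TRUE))

# Illustrative predicate: keep the labels "aA", "aB" and "aC"
matches_a_to_c <- function(x) grepl("^a[A-C]$", x)

p <- ggplot(dset, aes(x = x, y = y, label = label)) +
  geom_point(colour = "grey") +
  stat_dens2d_filter(geom = "text_repel",
                     keep.number = 50,
                     keep.these = matches_a_to_c,
                     min.segment.length = 0) +
  theme_bw()
p
```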

slowkow commented 1 year ago

@aphalo Amazing!

aphalo commented 1 year ago

@slowkow Thanks! The code in 'ggpp' is dead simple compared to the repulsion code in 'ggrepel' but it does seem to help quite a lot in some cases.