slowkow / ggrepel

Repel overlapping text labels away from each other in your ggplot2 figures.
https://ggrepel.slowkow.com
GNU General Public License v3.0
1.21k stars 95 forks

Vertical repulsion sensitive to label length and horizontal justification #224

Closed: acgoodman closed this issue 8 months ago

acgoodman commented 2 years ago

Summary

geom_*_repel() appears to be sensitive to label length and horizontal justification during vertical repulsion in narrow plots. This seems at first glance to be distinct from other justification issues.

Minimal code example

library(ggplot2)
library(ggrepel)

set.seed(100)

ggplot(mtcars, aes(mpg, disp))+
  geom_point()+
  geom_label_repel(
    aes(label = cyl), # label by cyl
    nudge_x = Inf, # put all labels on right
    direction = "y")+ # only adjust vertically
  scale_x_continuous(expand = expansion(mult=c(0,1))) # expand plot to the right

ggsave(filename="plot1.png", width=3, height=8, units = "in")
# short labels don't overlap with each other
ggplot(mtcars, aes(mpg, disp))+
  geom_point()+
  geom_label_repel(
    aes(label = row.names(mtcars)), # label by rowname
    nudge_x = Inf,
    direction = "y")+
  scale_x_continuous(expand = expansion(mult=c(0,1)))

ggsave(filename="plot2.png", width=3, height=8, units = "in")
# long labels DO overlap with each other
ggplot(mtcars, aes(mpg, disp))+
  geom_point()+
  geom_label_repel(
    aes(label = row.names(mtcars)),
    nudge_x = Inf,
    hjust = 1, # add hjust
    direction = "y")+
  scale_x_continuous(expand = expansion(mult=c(0,1)))

ggsave(filename="plot3.png", width=3, height=8, units = "in")
# right-justified labels don't overlap

Version information

Here is the output from sessionInfo() in my R session:

R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggrepel_0.9.1 ggplot2_3.3.6

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3     magrittr_2.0.3   tidyselect_1.1.2 munsell_0.5.0    colorspace_2.0-3
 [6] R6_2.5.1         rlang_1.0.2      fansi_1.0.3      dplyr_1.0.8      tools_4.1.1     
[11] grid_4.1.1       gtable_0.3.0     utf8_1.2.2       cli_3.3.0        DBI_1.1.2       
[16] withr_2.5.0      ellipsis_0.3.2   digest_0.6.29    assertthat_0.2.1 tibble_3.1.7    
[21] lifecycle_1.0.1  crayon_1.5.1     farver_2.1.0     purrr_0.3.4      vctrs_0.4.1     
[26] glue_1.6.2       labeling_0.4.2   compiler_4.1.1   pillar_1.7.0     generics_0.1.2  
[31] scales_1.1.1     pkgconfig_2.0.3 

This is my first GitHub issue! Hopefully there is enough information here.

slowkow commented 2 years ago

Thank you for the reprex!!!

aphalo commented 1 year ago

@slowkow @acgoodman I think the reason for the difference is not necessarily the length of the labels. It is possible that the numbers have vertically smaller bounding boxes than the capital letters in the same font, or that the box padding or box size is influenced in some way by the text. The labels using capital letters do not seem to fit in the available space with the default box padding. After decreasing box.padding, they barely fit into the available space.

I would not consider this to be a bug.
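One way to probe the bounding-box hypothesis above is to measure the text height that grid reports for a numeric label and for a car name. This is only a minimal sketch: `label_height_mm` is a hypothetical helper, the 11 pt font size is an assumption that roughly matches the default ggrepel text size, and it measures plain text grobs rather than the padded boxes that ggrepel actually lays out.

```r
library(grid)

# Open a null PDF device so unit conversion works without writing a file.
pdf(NULL)

# Hypothetical helper: height of a single-line text grob in millimetres.
# The default fontsize of 11 pt is an assumption, not a value read from ggrepel.
label_height_mm <- function(label, fontsize = 11) {
  grob <- textGrob(label, gp = gpar(fontsize = fontsize))
  convertHeight(grobHeight(grob), "mm", valueOnly = TRUE)
}

# Compare a digit-only label with a label made of letters.
heights <- vapply(
  c(digit = "8", name = "Cadillac Fleetwood"),
  label_height_mm,
  numeric(1)
)
print(heights)

dev.off()
```

If the two heights come out equal, the difference between the plots would have to come from somewhere other than the text height itself.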

library(ggplot2)
library(ggrepel)

ggplot(mtcars[order(mtcars$disp), ], aes(mpg, disp))+
  geom_point()+
  geom_label_repel(
    aes(label = row.names(mtcars)[order(mtcars$disp)]), # label by rowname, reordered to match the sorted data
    nudge_x = Inf,
    direction = "y",
    force = 0.5,
    force_pull = 0.1,
    box.padding = 0.06,
    max.time = 4,
    max.iter = 1e5)+
  scale_x_continuous(expand = expansion(mult=c(0.1,1)))

ggsave(filename="plot2a.png", width=4, height=8, units = "in")
# long labels NO LONGER overlap with each other

plot2a

Even without widening the plotting area as I did above (width = 4), the algorithm does what it can within the available space.

What I have noticed in general is that when space is very tight, one needs to decrease the forces and increase the number of iterations and the time limit to keep the labels from "jumping around".

ggsave(filename="plot2a.png", width=3, height=8, units = "in")
# long labels NO LONGER overlap with each other

plot2a
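The advice about lowering the forces and raising the iteration limit can be explored by rendering the same plot under a few settings and comparing the results by eye. The following is a rough sketch, not a recommended recipe: `repel_plot` is a hypothetical helper, the grid of `force` and `max.iter` values is arbitrary, and the files are written to a temporary directory only to keep the example self-contained.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical helper: the plot from this thread with adjustable settings.
# A fixed seed keeps the label layout reproducible between runs.
repel_plot <- function(force, max.iter, box.padding = 0.06, seed = 100) {
  cars <- mtcars[order(mtcars$disp), ]
  cars$car <- row.names(cars) # keep labels aligned with the sorted rows
  ggplot(cars, aes(mpg, disp)) +
    geom_point() +
    geom_label_repel(
      aes(label = car),
      nudge_x = Inf,       # put all labels on the right
      direction = "y",     # only adjust vertically
      force = force,
      force_pull = 0.1,
      box.padding = box.padding,
      max.iter = max.iter,
      seed = seed
    ) +
    scale_x_continuous(expand = expansion(mult = c(0.1, 1)))
}

# Arbitrary settings to compare; adjust as needed.
settings <- expand.grid(force = c(0.5, 1), max.iter = c(1e4, 1e5))
out_dir <- tempdir()

for (i in seq_len(nrow(settings))) {
  p <- repel_plot(settings$force[i], settings$max.iter[i])
  file <- sprintf("repel_force-%.1f_iter-%d.png",
                  settings$force[i], as.integer(settings$max.iter[i]))
  ggsave(file.path(out_dir, file), p, width = 3, height = 8, units = "in")
}
```

Because ggrepel does not report whether any overlaps remain, the saved images still have to be inspected manually.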