slowkow / ggrepel

Repel overlapping text labels away from each other in your ggplot2 figures.
https://ggrepel.slowkow.com
GNU General Public License v3.0

ggrepel accepts negative values for force_pull, leading to undesirable results #220

Closed: Blundys closed this issue 2 years ago

Blundys commented 2 years ago

Hi, I noticed that force_pull is incredibly sensitive to the plot resolution. The label positions change dramatically at one single change in plot size, rather than shifting gradually as you might expect.

If I set force_pull slightly negative to space the labels out a little, the plot looks great on screen in RStudio and when I first save it. But when I adjusted the plot size slightly, the labels were pushed to the extremes of the plot. Once I noticed this, I also tried resizing the plot window in RStudio: the layout stays much the same up to a point, and then the same dramatic change happens with a very small change in size.

The size at which this dramatic change occurs seems to depend on the seed. In my example below, if I set the seed to 99, both plots have labels pushed to the margins, and only when I set width and height to 10 do the labels come in close to the points.

library(ggplot2)
library(ggrepel)
set.seed(42)

dat <- mtcars
dat$car <- rownames(dat)

p <- ggplot(dat, aes(wt, mpg, label = car)) +
  geom_point(color = "red")

p2 <- p + geom_text_repel(force_pull = -0.1, seed = 1) + labs(title = "geom_text_repel()")

ggsave(p2, filename = "C:/temp/test.png", width = 8, height = 8)

[Image: test.png]

ggsave(p2, filename = "C:/temp/test1.png", width = 7, height = 7)

[Image: test1.png]

sessionInfo()
#> R version 4.0.5 (2021-03-31)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19044)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
#> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
#> [5] LC_TIME=English_Australia.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ggrepel_0.9.1 ggplot2_3.3.5
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.7       compiler_4.0.5   pillar_1.6.2     highr_0.9       
#>  [5] tools_4.0.5      digest_0.6.27    evaluate_0.14    lifecycle_1.0.0 
#>  [9] tibble_3.1.5     gtable_0.3.0     pkgconfig_2.0.3  rlang_0.4.11    
#> [13] reprex_2.0.1     DBI_1.1.1        cli_3.0.1        rstudioapi_0.13 
#> [17] yaml_2.2.1       xfun_0.25        fastmap_1.1.0    withr_2.4.2     
#> [21] stringr_1.4.0    dplyr_1.0.7      knitr_1.33       generics_0.1.0  
#> [25] fs_1.5.0         vctrs_0.3.8      tidyselect_1.1.1 grid_4.0.5      
#> [29] glue_1.4.2       R6_2.5.1         fansi_0.5.0      rmarkdown_2.10  
#> [33] farver_2.1.0     purrr_0.3.4      magrittr_2.0.1   scales_1.1.1    
#> [37] htmltools_0.5.2  ellipsis_0.3.2   assertthat_0.2.1 colorspace_2.0-2
#> [41] labeling_0.4.2   utf8_1.2.2       stringi_1.7.4    munsell_0.5.0   
#> [45] crayon_1.4.1

Created on 2022-04-29 by the reprex package (v2.0.1)

slowkow commented 2 years ago

> If I set force_pull slightly negative in order to space the labels out a little bit

Sorry, but ggrepel should throw an error when force_pull is less than zero. That is not a valid input for the repulsion algorithm.
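
A caller can also guard the argument themselves. A minimal sketch, assuming two hypothetical helpers named `check_force_pull()` and `geom_text_repel_checked()` (neither is part of ggrepel) that reject negative values before delegating to `geom_text_repel()`:

```r
# Hypothetical helper, not part of ggrepel: validate force_pull up front.
check_force_pull <- function(force_pull) {
  if (!is.numeric(force_pull) || length(force_pull) != 1 || is.na(force_pull)) {
    stop("force_pull must be a single non-missing number", call. = FALSE)
  }
  if (force_pull < 0) {
    stop("force_pull must be >= 0, got ", force_pull, call. = FALSE)
  }
  invisible(force_pull)
}

# Hypothetical wrapper: fail fast on invalid input, then delegate to ggrepel.
geom_text_repel_checked <- function(..., force_pull = 1) {
  check_force_pull(force_pull)
  ggrepel::geom_text_repel(..., force_pull = force_pull)
}
```

With this in place, `geom_text_repel_checked(force_pull = -0.1)` would stop with an error instead of silently producing an unstable layout.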

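The algorithm uses two forces: a repulsion force (force) that pushes overlapping labels away from each other, and a pull force (force_pull) that attracts each label toward its own data point.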

If we set the pull force to zero, then labels are free to drift anywhere along the plot.

If we set the pull force to a negative value, then labels are actively pushed away from their respective points.

I hope this helps you to understand why we should not be setting the pull force to a negative value.
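
To make the sign argument concrete, here is a toy one-dimensional model. It is not ggrepel's actual implementation, only an assumed spring-like update in which the pull is proportional to the label's distance from its point. The function name `simulate_pull()` and the step count are made up for illustration.

```r
# Toy model (an assumption, not ggrepel's code): at each step the label moves
# by force_pull * (point - label), i.e. a pull proportional to the distance.
simulate_pull <- function(force_pull, label = 1, point = 0, steps = 50) {
  for (i in seq_len(steps)) {
    label <- label + force_pull * (point - label)
  }
  abs(label - point)  # final distance between label and point
}

simulate_pull(0.1)   # positive: the distance shrinks toward 0
simulate_pull(0)     # zero: the label stays where it started
simulate_pull(-0.1)  # negative: the distance grows with every step
```

In this toy model nothing bounds the growth for a negative value, which would be consistent with labels ending up at the margins of the plot.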

Please see if you can achieve your desired result with the other available options.

Pull requests to improve the ggrepel examples are very welcome!
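
As one possible starting point, the sketch below keeps force_pull at its default and instead adjusts `box.padding`, `force`, `min.segment.length`, and `max.overlaps`, which are existing `geom_text_repel()` arguments. The specific values are guesses for the mtcars example above, not recommendations, and will likely need tuning for real data.

```r
library(ggplot2)
library(ggrepel)

dat <- mtcars
dat$car <- rownames(dat)

p_alt <- ggplot(dat, aes(wt, mpg, label = car)) +
  geom_point(color = "red") +
  geom_text_repel(
    box.padding = 0.8,       # more padding around each label's bounding box
    force = 2,               # stronger repulsion between overlapping labels
    min.segment.length = 0,  # always draw the segment back to the point
    max.overlaps = Inf,      # do not drop labels that still overlap
    seed = 1                 # fixed seed for a reproducible layout
  )
```

Printing `p_alt` or passing it to `ggsave()` at a few different sizes is a quick way to see whether the layout stays stable.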

Blundys commented 2 years ago

Interesting, I certainly don't get an error message when I set it to a negative value. I saw very little difference between the default and force_pull = 0, which is why I set it negative. I was hoping to deliberately push the labels away from the points: my real plot has lines behind the points I am labelling, so I wanted to move the labels further away so that the lines could be seen, with the labels connected to the points by the segment lines. Because of the shape of the data, forcing a direction with nudge didn't work (I wanted the labels to move away from the cluster of points, not in a specific direction). Also, point padding seems to cut the segment line off further from the point, so using it to push the labels away led to ambiguity in what they are labelling.

Anyway, I was able to achieve what I was hoping for using the negative force_pull; I just need to review the resolution of each plot to check. Thanks
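
Reviewing each size by hand can be scripted. A minimal sketch, assuming a hypothetical helper `save_at_sizes()` (not part of ggplot2 or ggrepel) that writes the same plot at several square sizes into a temporary directory so the outputs can be compared side by side:

```r
library(ggplot2)

# Hypothetical helper: save one plot at several square sizes (in inches)
# and return the file paths for inspection.
save_at_sizes <- function(plot, sizes, dir = tempdir(), prefix = "size_check") {
  vapply(sizes, function(s) {
    path <- file.path(dir, sprintf("%s_%sx%s.png", prefix, s, s))
    ggsave(path, plot = plot, width = s, height = s)
    path
  }, character(1))
}

# Usage with the plot from the reprex above (sizes chosen arbitrarily):
# paths <- save_at_sizes(p2, sizes = c(7, 7.5, 8, 10))
```

Because the layout also depends on the seed, the check is only meaningful if the `seed` argument of `geom_text_repel()` is held fixed while the sizes vary.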