tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

'ggplot2' position functions and 'ggrepel' #4468

Closed aphalo closed 2 years ago

aphalo commented 3 years ago

This is a question about interoperation between packages and how best to achieve it. I have made a pull request for 'ggrepel' that tries to ease this problem. During an exchange of ideas with @slowkow, the author and maintainer of 'ggrepel', we thought it best to ask the maintainers of 'ggplot2' for their opinion here before releasing some new features in 'ggrepel' and 'ggpmisc'.

Originally, geom_text_repel() and geom_label_repel() did not have a position parameter, and nudging was handled by a private version of position_nudge(), currently exported as position_nudge_repel(). The repulsive geoms need as input both the nudged x and y positions and the original x and y positions to be able to draw a segment or arrow connecting them. The names that position_nudge_repel() currently uses for the variables holding these positions differ from those used by 'ggplot2' position functions. My proposal, as currently implemented in my pull request, is to use x and y for the nudged positions, for consistency with 'ggplot2', and to call the variables holding the original positions x_orig and y_orig. These last two names are the main concern of this issue.
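The proposed naming can be illustrated with a minimal base R sketch. The helper `nudge_keep_orig()` is hypothetical and is not part of 'ggplot2', 'ggrepel' or 'ggpmisc'; it only assumes that a layer's data is a data frame with numeric `x` and `y` columns, and it shows which column holds which position after nudging.

```r
# Hypothetical helper, not an existing function in any package:
# nudge x and y, keeping the pre-nudge positions in x_orig and y_orig.
nudge_keep_orig <- function(data, nudge_x = 0, nudge_y = 0) {
  data$x_orig <- data$x            # original position, before nudging
  data$y_orig <- data$y
  data$x <- data$x + nudge_x       # nudged position stays in x and y
  data$y <- data$y + nudge_y
  data
}

labels <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30),
                     label = c("a", "b", "c"))
nudged <- nudge_keep_orig(labels, nudge_x = 0.5, nudge_y = -2)
```

A geom that ignores `x_orig` and `y_orig` sees the same `x` and `y` it would get from plain nudging, while a geom that draws connectors can use both pairs of columns.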

I think it is important to agree on suitable names because I have written some new position functions that I would like to be usable both with the repulsive geoms from 'ggrepel' and with geoms defined in 'ggplot2' and its extensions, even if in the latter case the original positions are ignored. In addition, I think consistent naming of the variables that store the original positions would ease the development of non-repulsive geoms that draw connecting segments based only on nudging. In my tests, a more sophisticated approach to nudging than the one available in 'ggplot2' can in some cases make repulsion more effective or even unnecessary.

I am not sure whether changing the position functions in 'ggplot2' so that they save the original x and y positions makes sense at this early stage, but agreeing on a naming convention that would be acceptable in the future could avoid compatibility headaches.

Preliminary versions of the new nudge functions are available in my 'ggpmisc' package, from which I show four example plots that use automatically computed nudging instead of repulsion to position text labels. (For this approach to work well with longer text labels, the justification also needs to be computed in coordination with the nudging, but the implementation of the justification is still at an early stage.)

[Figures: four example plots (position_nudge_line-6, position_nudge_line-12, unnamed-chunk-77-1, unnamed-chunk-64-1)]

Thanks in advance for any ideas or suggestions.

clauswilke commented 3 years ago

Do you know if any of the current position adjustments in ggplot2 store the original coordinates? I believe they don't.

From my perspective, if ggrepel and ggpmisc define a standard that sounds good to me. But if you want this to be locked in at the ggplot2 level in some form then I think we'd need a PR that modifies appropriate ggplot2 positions and maybe also adds a simple geom that makes use of this info (maybe a type of segment geom?)

aphalo commented 3 years ago

Thanks for the fast answer! As far as I know none of the position adjustments in ggplot2 store the original coordinates. I will check to make sure.

Modifying the position functions to keep the original coordinates is trivial and a PR would be easy to make, but the question in my mind is if it is reasonable to increase the size of the returned object for all users. Maybe adding a parameter to enable and disable the saving of original positions would be the best approach.
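The opt-in idea could look roughly like the following base R sketch. The function `nudge_optional_orig()` and its `keep_orig` argument are hypothetical names chosen for illustration, not an existing 'ggplot2' interface; the sketch assumes the layer data is a data frame with numeric `x` and `y` columns.

```r
# Hypothetical sketch: original positions are saved only on request,
# so the returned data does not grow for users who do not need them.
nudge_optional_orig <- function(data, nudge_x = 0, nudge_y = 0,
                                keep_orig = FALSE) {
  if (keep_orig) {
    data$x_orig <- data$x
    data$y_orig <- data$y
  }
  data$x <- data$x + nudge_x
  data$y <- data$y + nudge_y
  data
}

pts <- data.frame(x = c(1, 2), y = c(3, 4))
slim <- nudge_optional_orig(pts, nudge_x = 1)                    # no extra columns
full <- nudge_optional_orig(pts, nudge_x = 1, keep_orig = TRUE)  # adds x_orig, y_orig
```

With the default `keep_orig = FALSE`, the result has the same columns as the input, so existing users would see no change.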

A segment geom is a possibility, but then could we not directly use "xend" and "yend" for the original coordinates? Something to think about...

clauswilke commented 3 years ago

> A segment geom is a possibility, but then could we not directly use "xend" and "yend" for the original coordinates? Something to think about...

I thought about it. I think it's unintuitive to call the original coordinates xend and yend.

clauswilke commented 3 years ago

Also, if it's meant to be a geom specifically to draw annotation segments, you could add additional functionality such as dropping segments that are too short.
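Dropping short segments could be sketched as follows, assuming the data already carries `x_orig` and `y_orig` columns as proposed earlier in this issue. The function `drop_short_segments()` and its `min_length` argument are hypothetical, and the length is measured in data units for simplicity; a real geom would more likely compare lengths in display units.

```r
# Hypothetical sketch: keep only rows whose connecting segment, from
# (x_orig, y_orig) to (x, y), is at least min_length long (data units).
drop_short_segments <- function(data, min_length = 0.1) {
  seg_length <- sqrt((data$x - data$x_orig)^2 + (data$y - data$y_orig)^2)
  data[seg_length >= min_length, , drop = FALSE]
}

segs <- data.frame(
  x      = c(1.0, 2.5, 3.0),
  y      = c(1.0, 2.0, 3.05),
  x_orig = c(1.0, 2.0, 3.0),
  y_orig = c(1.0, 2.0, 3.0)
)
kept <- drop_short_segments(segs, min_length = 0.1)
```

Here only the second row has a segment long enough to be kept; the un-nudged first row and the barely nudged third row are dropped.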

aphalo commented 3 years ago

Do you have any preference between x_orig/y_orig and xorig/yorig?

thomasp85 commented 2 years ago

@aphalo is this issue still relevant here, or have you moved ahead in ggrepel and ggpmisc?

aphalo commented 2 years ago

@thomasp85 @slowkow After splitting ggpmisc into two packages (ggpp and a slimmer ggpmisc), I have implemented in ggpp some position functions that return the original positions before nudging in two additional columns: data$x_orig and data$y_orig. These functions are rather experimental but already available in the version of ggpp now on CRAN. The original positions are needed whenever a connecting segment or arrow to the original position is to be plotted. In this sense it is a general problem for which an agreed-upon name would be useful. I see little recent activity on GitHub for ggrepel. I made a pull request in May but it has not been merged. I should probably have another look at it just in case.

hadley commented 2 years ago

Without strong demand across multiple packages, I think there's little benefit for this code to live in ggplot2, and it's easier to maintain if it lives in a package with a more frequent release cycle.

aphalo commented 2 years ago

@hadley o.k., thanks! I agree. I'll keep it in my package 'ggpp', and compatible with 'ggrepel' geoms.