slowkow / ggrepel

:round_pushpin: Repel overlapping text labels away from each other in your ggplot2 figures.
https://ggrepel.slowkow.com
GNU General Public License v3.0

Can you extract ggrepel coordinates? #24

Closed zachcp closed 8 years ago

zachcp commented 8 years ago

Hi @slowkow ,

Great package. I am wondering if there is a way to get back the pre-transformed coordinate locations for ggrepel text points. For example, once plotted, it is possible to obtain the ggplot values of where the points are, but it would be very helpful for hacking purposes to expose those values somehow.

This should give an idea of what I am hoping to achieve. I don't know if it is possible to get this data any other way besides plotting and pulling out the post-processed values.

library(ggrepel)
library(gtable)

gg <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = 'red') +
  geom_text_repel(aes(label = rownames(mtcars))) 

#pull out the ggrepel grob bits 
built <- ggplot_gtable(ggplot_build(gg))        
plotdata <- built$grobs[[4]]$children$geom_text_repel.textrepeltree.53$data

# plot data will have the x and y values where ggrepel is plotted but they
# are expressed as a range between 0 and 1.
# can we get the values expressed in terms of the original data?

Thanks,

zach cp

zachcp commented 8 years ago

Update: it looks like the $data I was extracting in the example holds the un-repelled locations. To get an idea of what I am playing with, see the image below. I was going to use ggrepel to draw lines and then add grobs on top of the repelled locations. Once you have those locations you can go back and overlay whatever you want: images, grobs, etc. In my case I was trying out some pie charts.

If you can get untransformed coordinates of the final text locations, you can use ggplot directly to add the grobs. If you only have the final, panel-based coordinates, which is what I was working with, you can use a trick such as cowplot::ggdraw() and then overlay your images using those coordinates. I could make the text go away by playing with the size or alpha settings.

Unfortunately the coordinates I thought were the repelled coords are actually just the original coordinates in panel-space. I'd love to have access to the modified, repelled locations....

[image: screen shot 2016-02-26 at 5 06 56 pm]

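A minimal sketch of the rescaling asked about here, assuming a linear scale and no coordinate transformation: panel-relative values between 0 and 1 map back to data units through the panel's axis range. The function name `npc_to_data` and both ranges are hypothetical and chosen only for illustration; a real plot's ranges would have to be read from the built plot.

```r
# Map panel-relative (0-1) coordinates back to data units by linear rescaling.
# Assumes a continuous, untransformed axis.
npc_to_data <- function(npc, range) {
  range[1] + npc * diff(range)
}

# Hypothetical panel ranges, typed in by hand for illustration only.
x_range <- c(1, 6)
y_range <- c(10, 35)

x_data <- npc_to_data(c(0, 0.5, 1), x_range)
y_data <- npc_to_data(c(0, 0.5, 1), y_range)
```

The endpoints 0 and 1 land on the lower and upper limits of the range, so anything in between is a straight interpolation.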
slowkow commented 8 years ago

Thanks for describing this interesting use case. I think we might have to iterate a few times to get this right.

As a first step, could I ask you to please try using this code to get at the repelled coordinates? Please let me know how it goes, and then we can consider some possible ways forward to make this easier in the future.

library(ggrepel)
library(grid)

g <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = 'red') +
  geom_text_repel(aes(label = rownames(mtcars))) 

grid.force()

kids <- childNames(grid.get("textrepeltree", grep = TRUE))

d <- do.call(rbind, lapply(kids, function(n) {
  x <- grid.get(n)
  data.frame(
    x = convertX(x$x, "native", valueOnly = TRUE),
    y = convertY(x$y, "native", valueOnly = TRUE),
    x.orig = convertX(x$x.orig, "native", valueOnly = TRUE),
    y.orig = convertY(x$y.orig, "native", valueOnly = TRUE)
  )
}))

plot(d$x.orig, d$y.orig)
points(d$x, d$y)
segments(d$x, d$y, d$x.orig, d$y.orig)

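[image] https://cloud.githubusercontent.com/assets/209714/13380978/b8098756-de1d-11e5-8a64-a21b1538f543.png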

zachcp commented 8 years ago

Hi Kamil,

Thanks for your excellent package and your quick reply. I will give this a go although, after sleeping on the problem, I think the easier/cleaner way is the following:

  1. Call your repel_boxes function directly with the input data
  2. Obtain the repelled coords.
  3. Only then invoke ggplot, using geom_point for the original points, geom_segment for the arrows, and annotation_custom to place plots/images.
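Step 3 could look roughly like the sketch below. It assumes the repelled coordinates are already available in data units; the data frame `pts`, its column names, and the box half-width `half` are all made up for illustration, and a plain circle grob stands in for a real image or pie chart.

```r
library(ggplot2)
library(grid)

# Hypothetical original points (x, y) and repelled positions (x_rep, y_rep).
pts <- data.frame(
  x = c(1, 2, 3),       y = c(1, 2, 1.5),
  x_rep = c(0.6, 2, 3.4), y_rep = c(1.4, 2.5, 1.9)
)

# Original points plus a segment from each point to its repelled position.
p <- ggplot(pts) +
  geom_segment(aes(x = x, y = y, xend = x_rep, yend = y_rep)) +
  geom_point(aes(x, y))

# Place one grob per row, centred on the repelled position.
half <- 0.15  # half-width of each grob's box, in data units
for (i in seq_len(nrow(pts))) {
  p <- p + annotation_custom(
    grob = circleGrob(gp = gpar(fill = "grey80")),
    xmin = pts$x_rep[i] - half, xmax = pts$x_rep[i] + half,
    ymin = pts$y_rep[i] - half, ymax = pts$y_rep[i] + half
  )
}

p
```

annotation_custom() takes its extents in data units, which is why untransformed coordinates are the convenient form here.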

I will try to mess around with this a little later today.

zach cp


zachcp commented 8 years ago

Hi Kamil,

I came up with the following solution, which could be extended. It was straightforward to make, in part because your package is so well documented (thanks). I think the main advantage here is that you can calculate the coordinates using Rcpp and only need to use ggplot once, at the end. It will need some tweaking for use on real plots, but I think it's a good start.

zach cp

#' Given a Set of Points and Box sizes, find locations
#'
#'
findboxes <- function(df, xcol, ycol, pad_point_x, pad_point_y, xlim, ylim,
                      force = 1e-6, maxiter = 20000) {

  # x and y positions as a data frame
  posdf <- df[c(xcol,ycol)] 

  # returns a matrix in which each column is one point's box
  boxdf <- apply(posdf,1,function(row) { xval <- row[xcol]
                                         yval <- row[ycol]
                                        return(c(xval, 
                                                 yval, 
                                                 xval + pad_point_x, 
                                                 yval + pad_point_y))})                                       
  # columns are x1,y1,x2,y2
  boxmatrix = as.matrix(t(boxdf))

  moved <- ggrepel:::repel_boxes(data_points=as.matrix(posdf), 
                                 pad_point_x=0.1, 
                                 pad_point_y=0.1, 
                                 boxes = boxmatrix,
                                 xlim=xlim,
                                 ylim=ylim,
                                 force=force,
                                 maxiter=maxiter)

  finaldf <- cbind(posdf, moved)
  names(finaldf) <- c("x1","y1","x2","y2")
  return(finaldf)
}

df1 <- findboxes(mtcars, xcol = 'wt', ycol='mpg',
                 pad_point_x = 0.5,pad_point_y = 0.5,
                 xlim = c(0,8), ylim=c(10,35))

ggplot(df1) +
  geom_segment(aes(x=x1,y=y1,xend=x2,yend=y2)) +
  geom_point(aes(x1,y1), color="black") +
  geom_point(aes(x2,y2), color="red") 
[image: screen shot 2016-02-29 at 2 23 59 pm]

slowkow commented 8 years ago

That's excellent! Thanks for sharing your nice code snippet.

I'm very interested to see the pie chart plot you mentioned before. Could I ask you to please share a code snippet and an example plot? I think that's a really cool usage example, so I might like to feature it in the vignette.

zachcp commented 8 years ago

Sure, I'll try to work it up.

zachcp commented 8 years ago

@slowkow,

I put up a reproducible example here http://rpubs.com/zachcp/157645 with the gist here https://gist.github.com/zachcp/f2429fc17cf6c59d0967.

There are still some issues to work out with regard to getting a good spread. The major problem I saw on the ggrepel side is that repelled positions are able to escape the maxx and maxy values (see below).

[image: screen shot 2016-03-01 at 1 23 08 pm]

[image: screen shot 2016-03-01 at 1 24 52 pm]

thanks, zach cp

joelgombin commented 8 years ago

In line with this request, do you think the algorithm you use could also be used in a non-ggplot context? I'm thinking, for example, about using it for labelling (non-ggmap) maps. Maybe @mtennekes would be interested in implementing it in tmap.

I guess ideally that would mean packaging repel_boxes separately from ggrepel, so that one can use it without a ggplot2 dependency.

slowkow commented 8 years ago

@joelgombin Of course, the algorithm could be used outside of ggplot2, and I encourage anyone interested to please take the code and use it!

Since I believe that ggplot2 is the most popular way to create plots in R, I don't mind having it as a dependency and do not plan to create a new package without the dependency.

I think the findboxes() function in the RPub by @zachcp is a great start to making a public interface for the layout algorithm. I'll try to take inspiration from this code to expose a public function for ggrepel users so it's easy to make use of the layout algorithm without using the geom_text_repel geom.

rdisalv2 commented 7 years ago

Hi,

Thanks for ggrepel. Question about getting the locations of the labels from the ggplot object after it is rendered.

The code from @slowkow, grid.get("textrepeltree", grep = TRUE) seems to return NULL when I run it on the current version of ggrepel. I wasn't able to figure out something else to pass to grid.get. Is there a way to modify that code to get the points directly out of ggrepel in the current version?

I tried the alternative approach by @zachcp going directly to findboxes but I couldn't replicate the very high-quality plots I can get directly through geom_text_repel().

Thanks!

slowkow commented 7 years ago

I think you need to run grid.force() before grid.get(...).

rdisalv2 commented 7 years ago

Hmmm... ok it works if I render g before doing grid.force:

library(ggrepel)
library(grid)

g <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = 'red') +
  geom_text_repel(aes(label = rownames(mtcars))) 
g
grid.force()

without that extra g there it didn't seem to work -- guess I've revealed I don't understand much ggplot

Thanks for your help!
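For anyone hitting the same thing in a script: assigning the plot to g does not draw it, and grid.force() with no arguments only works on what has already been drawn on the current device. A sketch of the explicit version, assuming the grob name still matches "textrepeltree" in the installed ggrepel (that name comes from this thread and is not a documented interface):

```r
library(ggrepel)  # also attaches ggplot2
library(grid)

g <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = "red") +
  geom_text_repel(aes(label = rownames(mtcars)))

# Typing g at the console prints it implicitly; in a script or function,
# print() makes the drawing step explicit.
print(g)

# The drawn plot is now on the display list, so grid.force() has grobs
# to expand and the lookup has something to find.
grid.force()
tree <- grid.get("textrepeltree", grep = TRUE)
kids <- childNames(tree)
```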

slowkow commented 6 years ago

Matthias Grenié shared via a tweet:

Hi! Thanks @slowkow for building #ggrepel #rstats📦! It's amazing. sorry to bother on twitter but I'm struggling with it to access repel coordinates to reuse them. The code from https://github.com/slowkow/ggrepel/issues/24 … does not seem to work when using complex options (such as nudge_y, etc.)

Just so you know, I'm trying to get repelled labels coordinates to add geom_image from @guangchuangyu ggimage📦 to add pictures near labels... Not sure it's the best way to proceed...

Hi Matthias! Next time, you might consider commenting on the issue or making a new issue. It's easier for me to keep track and provide a response with code or images, and others have a chance to find our discussion, too. I get a notification either way, whether you post on twitter or github.

I think the private, non-exported code in the repel_boxes() Rcpp function changed sometime after this issue was posted, so that is probably why the findboxes() function stopped working.

Below, I updated the findboxes() function to work with ggrepel 0.8.0.

Please note that this is not officially supported by ggrepel, because the repel_boxes() function is not exported. (This is why we have to use 3 colons ::: to access it.)

I'm still not sure if I want to provide support for an exported repel_boxes() function.

For me, the example below doesn't work very well, and I'm not sure why. I tried to modify the padding and I never got a result that looks good to me. Some of the images are never pushed away from their corresponding points, even when it seems they should be. If you find a way to improve upon this, please share!

library(ggrepel)
library(ggimage)

#' Given a Set of Points and Box sizes, find locations
#' Written by @zachcp, updated by @slowkow
findboxes <- function(
  df, xcol, ycol,
  box_padding_x, box_padding_y,
  point_padding_x, point_padding_y,
  xlim, ylim,
  force = 1e-7, maxiter = 20000
) {

  # x and y positions as a data frame
  posdf <- df[c(xcol, ycol)]

  # returns a matrix in which each column is one point's box
  boxdf <- apply(posdf, 1, function(row) {
    xval <- row[xcol]
    yval <- row[ycol]
    return(c(
      xval - box_padding_x / 2,
      yval - box_padding_y / 2,
      xval + box_padding_x / 2,
      yval + box_padding_y / 2
    ))
  })
  # columns are x1,y1,x2,y2
  boxmatrix <- as.matrix(t(boxdf))

  moved <- ggrepel:::repel_boxes(
    data_points = as.matrix(posdf),
    point_padding_x = point_padding_x,
    point_padding_y = point_padding_y,
    boxes = boxmatrix,
    xlim = xlim,
    ylim = ylim,
    hjust = 0.5,
    vjust = 0.5,
    force = force,
    maxiter = maxiter
  )

  finaldf <- cbind(posdf, moved)
  names(finaldf) <- c("x1", "y1", "x2", "y2")
  return(finaldf)
}

df1 <- findboxes(mtcars,
  xcol = "wt", ycol = "mpg",
  box_padding_x = Reduce("-", rev(range(mtcars$mpg))) * 0.08,
  box_padding_y = Reduce("-", rev(range(mtcars$wt))) * 0.05,
  point_padding_x = Reduce("-", rev(range(mtcars$wt))) * 0.08,
  point_padding_y = Reduce("-", rev(range(mtcars$mpg))) * 0.05,
  force = 1e-6,
  xlim = c(0, 8),
  ylim = c(10, 35)
)

ggplot(df1) +
  geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2)) +
  geom_point(aes(x1, y1), color = "black") +
  # geom_point(aes(x2, y2), color = "red") +
  geom_image(aes(x2, y2), image = "https://www.r-project.org/logo/Rlogo.png") +
  NULL

[image: ggimage]
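The padding arguments in the call above are fractions of each axis's data range. As a small aside for anyone adapting it, this sketch shows that `Reduce("-", rev(range(x)))` is simply the width of the range, which `diff(range(x))` expresses more directly. The variable names are illustrative only.

```r
# Width of the wt range, computed the way the example above does it.
wt_span_reduce <- Reduce("-", rev(range(mtcars$wt)))

# The same quantity, written more directly.
wt_span_diff <- diff(range(mtcars$wt))

# A padding of 8% of the x-axis range, matching point_padding_x above.
point_padding_x <- wt_span_diff * 0.08
```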

McAllister-NOAA commented 4 years ago

A very useful function!! I hope you might consider supporting it in a future version of ggrepel. I used this function to find the data points for pie charts using scatterpie (which doesn't support repelling natively). Thanks!!

AlvaroMCMC commented 4 years ago

@slowkow Is there a way to get the coordinates in UTM WGS (projected coordinates)? I am plotting grobs created by annotation_custom from ggplot2; these are placed with x and y coordinates, and I want them repelled like labels. Thanks.

library(ggrepel)
library(grid)

g <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = 'red') +
  geom_text_repel(aes(label = rownames(mtcars))) 

grid.force()

kids <- childNames(grid.get("textrepeltree", grep = TRUE))

d <- do.call(rbind, lapply(kids, function(n) {
  x <- grid.get(n)
  data.frame(
    x = convertX(x$x, "native", valueOnly = TRUE),
    y = convertY(x$y, "native", valueOnly = TRUE),
    x.orig = convertX(x$x.orig, "native", valueOnly = TRUE),
    y.orig = convertY(x$y.orig, "native", valueOnly = TRUE)
  )
}))

plot(d$x.orig, d$y.orig)
points(d$x, d$y)
segments(d$x, d$y, d$x.orig, d$y.orig)

slowkow commented 4 years ago

@AlvaroMCMC You might consider trying the findboxes() function listed above, and please feel free to copy and modify it.

@McAllister-NOAA If you have any ideas for modifying the findboxes() function so it works better, please share! Thanks.

pstraforelli commented 3 years ago

Hello,

It seems that ggrepel:::repel_boxes() is no longer available as of version 0.9.1, breaking the above code. I notice that there is now a ggrepel:::repel_boxes2() but it's not a simple swap. Is there an alternative method?
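Because these functions are internal, code that calls them through ::: can break on any release. A defensive sketch that checks whether a name exists in a package's namespace before relying on it; `has_internal` is a hypothetical helper, not part of ggrepel, and it says nothing about whether the function's arguments have changed.

```r
# TRUE if `name` is defined in the namespace of `pkg`, exported or not.
has_internal <- function(pkg, name) {
  requireNamespace(pkg, quietly = TRUE) &&
    exists(name, envir = asNamespace(pkg), inherits = FALSE)
}

# Demonstrated on a base package so the sketch runs anywhere.
found   <- has_internal("stats", "median")
missing <- has_internal("stats", "no_such_function_xyz")

# For this thread, the checks of interest would be along the lines of
# has_internal("ggrepel", "repel_boxes") and
# has_internal("ggrepel", "repel_boxes2").
```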

alexvpickering commented 3 years ago

I forked ggrepel to get the extracted coordinates. It works well enough for me (though not as well): https://github.com/hms-dbmi/repel

Thanks for the great packages @slowkow!

alephreish commented 2 years ago

Given the lack of documentation for repel_boxes2, as a user searching for a quick solution for placing geometries at repelled positions with ggrepel 0.9.1 using packages available via conda, I found this https://stackoverflow.com/a/45067419/2832716 answer, similar to https://github.com/slowkow/ggrepel/issues/24#issuecomment-189917094, most adaptable. My function looks as follows:

library(grid)
library(ggrepel) # 0.9.1
library(ggplot2) # 3.3.5
library(tidyr)
library(dplyr)

get_repel_coords <- function(.data, g_base, width, height, ...) {
    grid.newpage()
    pushViewport(viewport(width = width, height = height))
    g <- g_base +
        geom_text_repel(aes(x, y), label = ".", data = .data, max.overlaps = Inf, ...)
    panel_params <- ggplot_build(g)$layout$panel_params[[1]]
    xrg <- panel_params$x.range
    yrg <- panel_params$y.range

    textrepeltree <- ggplotGrob(g) %>%
        grid.force(draw = FALSE) %>%
        getGrob("textrepeltree", grep = TRUE)
    children <- childNames(textrepeltree) %>%
        grep("textrepelgrob", ., value = TRUE)

    get_xy <- function(n) {
        grob <- getGrob(textrepeltree, n)
        data.frame(
            x.repel = xrg[1] + diff(xrg) * convertX(grob$x, "native", valueOnly = TRUE),
            y.repel = yrg[1] + diff(yrg) * convertY(grob$y, "native", valueOnly = TRUE)
        )
    }
    lapply(children, get_xy) %>%
        bind_rows() %>%
        cbind(.data)
}

This relies on textrepelgrob having the same order as the input rows which is the case de facto, although not promised explicitly.
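Since the row order is an assumption rather than a guarantee, one cheap safeguard is to confirm that the number of extracted positions matches the number of input rows before binding them. This sketch checks only the count, not the order; `bind_repelled` and the toy data frames are hypothetical and not part of ggrepel.

```r
# Bind repelled coordinates to the input rows, failing loudly if the
# two do not have the same number of rows.
bind_repelled <- function(coords, data) {
  if (nrow(coords) != nrow(data)) {
    stop("got ", nrow(coords), " repelled positions for ",
         nrow(data), " input rows")
  }
  cbind(coords, data)
}

# Toy inputs standing in for the extracted positions and the original data.
coords <- data.frame(x.repel = c(1.1, 2.2), y.repel = c(3.3, 4.4))
input  <- data.frame(x = c(1, 2), y = c(3, 4))

combined <- bind_repelled(coords, input)
```

A mismatch would show up, for example, if some labels were dropped rather than drawn, which is presumably why the function above sets max.overlaps = Inf.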