AllanCameron / geomtextpath

Create curved text paths in ggplot2
https://allancameron.github.io/geomtextpath

Enable hjust to place text proportionally along a line within the plot bounds #60

Closed: jwhendy closed this issue 2 years ago

jwhendy commented 2 years ago

Greetings,

Awesome package! I was just introduced to it via my StackOverflow question, and the answerer ran into issues reproducing my desired result with the hjust option. This bit from the manual is very enticing, though in the example it didn't seem entirely accurate:

Although such lines aren’t curved, there are some benefits to using the geomtextpath functions if a labelled reference line is required: only a single call is needed, co-ordinates are not required for the text label, ...

Mainly, while coordinates aren't required, it does seem that tailoring hjust to the specific line is still necessary. I'll re-create the example here.

base plot

library(ggplot2)

set.seed(123)
df <- data.frame(
  x = runif(100, 0, 1),
  y = runif(100, 0, 1))

lines <- data.frame(
  intercept = rep(0, 5),
  slope = c(0.1, 0.25, 0.5, 1, 2))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_abline(aes(intercept = intercept, slope = slope),
              linetype = "dashed", data = lines)

grab_2022-01-18_083517

manual labeling (trial and error to construct the data frame of x, y, and labels)

This would be the ideal output of some automated solution, placing the labels nicely off to the top/side.

labels <- data.frame(
  x = c(rep(1, 3), 0.95, 0.47),
  y = c(0.12, 0.28, 0.53, 1, 1),
  label = lines$slope)

p + geom_text(aes(label = label), color = "red", data = labels)

grab_2022-01-18_083626

solution from camille using geom_textabline

library(geomtextpath)
p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE)

grab_2022-01-18_083844

The hjust argument isn't doing exactly what I would expect given its typical ggplot2 meaning (left, center, right justification). I understand from camille's answer that 0.5 is the default value; however, from trial and error starting at 0 and incrementing by 0.1, I found that 0.4 is the first value at which the labels show up:

p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE,
                    hjust = 0.4)

grab_2022-01-18_084200

And by 0.6, the top label is already off the plot area, with the rest following at 0.7.

p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE,
                    hjust = 0.6)

grab_2022-01-18_084213

#27 and #34 seem related to this, but for contours. They basically give a sort of "auto-placement", though in this case it seems a lot more straightforward: just figuring out the min/max range of the line vs. having to figure out where a contour is "flattest".

I'll include her suggestions for "manual" placement, either from the data itself, or by using the plot object internals:

library(dplyr)  # provides %>% and mutate()

# y = intercept + slope * x
xmax <- max(df$x)
# or layer_scales(p)$x$get_limits()[2] for data range
# or ggplot_build(p)$layout$panel_params[[1]]$x.range[2] for panel range
ymax <- max(df$y)
# place each label where its line first hits the right edge (xmax) or the top edge (ymax)
lines_calc <- lines %>%
  mutate(xcalc = pmin((ymax - intercept) / slope, xmax),
         ycalc = pmin(intercept + slope * xmax, ymax))

p +
  geom_text(aes(x = xcalc, y = ycalc, label = as.character(slope)),
            data = lines_calc, vjust = 0, nudge_y = 0.02)

I've honestly never programmed "under the hood" in an R package, but would be willing to try if you think there is merit/feasibility to any of these approaches? Or if there was a suggested couple of files to look in for something like the $get_limits trick that might be used for this package, I could take those as a starting point?

Let me know what you think, and thanks for your consideration/thoughts!

teunbrand commented 2 years ago

I don't think the auto-placement rules are at the heart of the issue. I think one reason for the discrepancy between how hjust behaves and what our intuition says is the lines below, which expand the x-range:

https://github.com/AllanCameron/geomtextpath/blob/9f90609048f77020b7f5138ef147bd6b1b5daa59/R/geom_textabline.R#L284-L288

A second reason is because the y-values for the slope = 2 are out-of-bounds, creating a longer 'arc'-length than what somebody perceives when the panel is clipped.

I don't really see a way around this problem besides manually clipping (calculating the intersection points with the axes). The proper place for this is the Geom{Text/Label}abline ggproto classes, without any adjustments further downstream. This is essentially incompatible with the line ending trick mentioned above. Allan, do you think that fixing the hjust is more important than having nice line-endings?
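The manual clipping described above could look roughly like the following. This is a minimal sketch, not the package's code: `clip_abline()` is a hypothetical helper, and it assumes the panel limits are already known and that the line actually crosses the panel.

```r
# Hypothetical helper: endpoints of the part of y = intercept + slope * x
# that lies inside the panel given by xlim and ylim.
clip_abline <- function(intercept, slope, xlim, ylim) {
  # Start from the full x-range of the panel
  x <- xlim
  if (slope != 0) {
    # x-values at which the line crosses the bottom and top of the panel
    x_at_y <- sort((ylim - intercept) / slope)
    # Shrink the x-range so that y stays inside ylim
    x <- c(max(x[1], x_at_y[1]), min(x[2], x_at_y[2]))
  }
  data.frame(x = x, y = intercept + slope * x)
}

# The slope = 2 line from the example leaves a unit panel through the top
# edge, so it is cut off at x = 0.5 instead of running to x = 1.
clip_abline(intercept = 0, slope = 2, xlim = c(0, 1), ylim = c(0, 1))
```

With the clipped endpoints, hjust = 0 and hjust = 1 would correspond to the visible ends of the line. The sketch does not check for lines that miss the panel entirely.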

As for the ideal solution to your problem, I don't think placing the labels at 0-degree angles at the edges is within the scope of this function: placing a label along the line would be.

jwhendy commented 2 years ago

> I don't think the auto-placement rules are at the heart of the issue.

I think my intuitive use of "auto-place" vs. what's meant in this project (e.g. for contours) is confusing. Sorry about that. I just meant "given a line, auto-place the labels proportionally along it without me having to figure out the positions via trial and error manually."

I can change the title if it's helpful? "Enable placement of text proportionally along a line within the plot bounds using hjust"?

> This is essentially incompatible with the line ending trick mentioned above. Allan, do you think that fixing the hjust is more important than having nice line-endings?

Forgive my guessing, given that I'm new to the library, but would a third option be to split up the handling for lines vs. labels? I noticed that the chunk at L176 is repeated at L284... do these have to be identical?

teunbrand commented 2 years ago

> "given a line, auto-place the labels proportionally along it without me having to figure out the positions via trial and error manually."

Yes, I agree, that is exactly what I think it should do, but at the moment that doesn't happen for the abline variant.

> Forgive my guessing, given that I'm new to the library, but would a third option be to split up the handling for lines vs. labels? I noticed that the chunk at L176 is repeated at L284... do these have to be identical?

There is some redundancy between code because we have a plain text and label variant (with textbox) that are very similar. We have to keep labels and lines together because we allow the gap argument to break up the line if it appears to intersect with the text.

AllanCameron commented 2 years ago

Hi John - thanks for writing. You give a clear demonstration of the problem, and it was one I was aware of when I was writing the geom_textabline function - as Teun says, it is fairly clear where the problem lies: the reference lines are all extended way off the plotting area to ensure that ugly line ends aren't visible. This means that for the reference line functions, the hjust currently needs to be tweaked by the user on a per-line basis. This is pretty easy to do using the scale_hjust_manual function, which was included specifically for this purpose.

p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope),
                        hjust = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE,
                    color = "red") +
  scale_hjust_manual(values = c(0.65, 0.65, 0.65, 0.65, 0.5))

However, the problem is that the hjust values we need to supply are a bit off. The ends of the bottom four on-screen lines effectively map to 0.33 - 0.67 instead of 0 - 1 because of the line extension code that Teun pointed out, and the ends of the top line effectively map to 0.33 - 0.5 because it doesn't reach the right edge of the screen.

@teun - I don't think we need to shorten the lines to get the hjust working more intuitively - I think we can calculate the visible portion of the line inside the draw_panel function and remap the hjust onto it.
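As a rough illustration of that remapping idea (a sketch, not the code that ended up in the package): `visible_fraction()` and `remap_hjust()` are hypothetical helpers, the panel is taken to be the unit square, and the `extend = 1` default assumes the line is extended by one panel width on each side, which is only inferred from the 0.33 - 0.67 figures above.

```r
# Hypothetical helper: the fractions along the extended line at which the
# visible part of y = intercept + slope * x starts and ends.
visible_fraction <- function(intercept, slope,
                             xlim = c(0, 1), ylim = c(0, 1), extend = 1) {
  # Assumed extension: `extend` panel widths beyond each end of the x-range
  width <- diff(xlim)
  x_ext <- c(xlim[1] - extend * width, xlim[2] + extend * width)

  # x-range over which the line is inside the panel
  x_vis <- xlim
  if (slope != 0) {
    x_at_y <- sort((ylim - intercept) / slope)
    x_vis <- c(max(xlim[1], x_at_y[1]), min(xlim[2], x_at_y[2]))
  }

  # On a straight line, arc length is proportional to x, so fractions of
  # the x-extent are also fractions of the line's length.
  (x_vis - x_ext[1]) / diff(x_ext)
}

# Map an hjust meant for the visible segment onto the extended line
remap_hjust <- function(hjust, frac) frac[1] + hjust * diff(frac)

frac <- visible_fraction(intercept = 0, slope = 2)  # about 0.33 and 0.5
remap_hjust(1, frac)                                # 0.5
```

Under these assumptions, a user-facing hjust of 1 on the slope = 1 line maps to about 0.67 on the extended line, close to the hand-tuned 0.65 above. The real panel range is slightly wider than the data because of ggplot2's default scale expansion, so the exact numbers would differ.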

jwhendy commented 2 years ago

@teunbrand

I tweaked the title to hopefully be more accurate.

> There is some redundancy between code because we have a plain text and label variant (with textbox) that are very similar. We have to keep labels and lines together because we allow the gap argument to break up the line if it appears to intersect with the text.

Bah, sorry, I hadn't parsed the meaning of the two functions in R/geom_textabline. I thought one was creating just the line, specifically, and the other was applying the text (my interpretation of "label"). If this were the case, it seemed that the extremes of the line and text could be different. Now I get that these are just two variants.

@AllanCameron

> I don't think we need to shorten the lines to get the hjust working more intuitively - I think we can calculate the visible portion of the line inside the draw_panel function and remap the hjust onto it.

This was my hope, but I wasn't sure where one would start. Basically, keep using your expanded lines trick while also "smartly" calculating the true bounds (and thus fitting to either xmax or ymax, whichever is hit first). Would you want me to try implementing something? I'm not sure of the exact flow of things, but I might be able to prototype something... let me know!

> This is pretty easy to do using the scale_hjust_manual function, which was included specifically for this purpose.

Awesome and this is still an easier workaround with at most half the guesses required vs. my approach in the example :)

Thanks for the quick help and consideration!

AllanCameron commented 2 years ago

Thanks @jwhendy

> Would you want me to try implementing something?

You are of course very welcome to clone the repo, make changes, and submit a pull request, but Teun and I are quite well immersed in the code base (and in particular the effects that changing one part might have on another), so if you wait a couple of days (or maybe even hours!) we'll see what we can do.

jwhendy commented 2 years ago

@AllanCameron indeed, exactly the sort of cost-benefit analysis I was looking for :) I really appreciate it, and was super surprised to see the SO answer on a new package. Since stumbling on it, I'm sure this will become one of my go-tos, like ggrepel!

teunbrand commented 2 years ago

I'll take on this issue; there are some opportunities for refactoring here as well (unless Allan already has a working solution at the moment he reads this).

jwhendy commented 2 years ago

Just came here to link to the answer I posted with your solution, @AllanCameron. Couldn't help but see the PR and the close as well. You are fast!

I think this is CloseEnough, but I did re-install and re-run my example. What are your thoughts on the behavior of hjust = 1? I would intuitively have expected either:

I did not expect some to be more visible than others, and some still off the plot border.

library(ggplot2)
library(geomtextpath)

set.seed(123)
df <- data.frame(
  x = runif(100, 0, 1),
  y = runif(100, 0, 1))
lines <- data.frame(
  intercept = rep(0, 5),
  slope = c(0.1, 0.25, 0.5, 1, 2))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_abline(aes(intercept = intercept, slope = slope),
              linetype = "dashed", data = lines)
p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    hjust = 1,
                    data = lines,
                    gap = FALSE,
                    text_only = TRUE,
                    offset = unit(0.2, "lines"),
                    color = "red")

grab_2022-01-18_164005

Using hjust=0.95 works great:

grab_2022-01-18_164033

AllanCameron commented 2 years ago

Yes John, @teunbrand has come up with the goods even quicker than I expected!

The "2" is off the page simply because its vertical justification nudges it beyond the plotting margins. If you use a vjust of 0.5 or larger you should see it popping into view.

I think I will close this issue for now, as I am much happier with the new behaviour and it had a lower overhead than expected.

Many thanks for bringing this to our attention John - it's useful to get feedback to help iron out these early bugs.

jwhendy commented 2 years ago

> The "2" is off the page simply because its vertical justification nudges it beyond the plotting margins. If you use a vjust of 0.5 or larger you should see it popping into view.

I didn't see a change with hjust=1, vjust=1, and others are also partially off the page, but I also completely agree with:

> I think I will close this issue for now, as I am much happier with the new behaviour and it had a lower overhead than expected.

Indeed, works for me and this is easily 95% improved on the behavior of hjust. Much appreciated to the both of you!