Closed teunbrand closed 2 years ago
This might be a nice touch. Though "flat" isn't the opposite of "steep" in this context. It's the opposite of "curved", so for example, in your economics example, the peaks are steep but they are also the "flattest" regions for text to be placed.
You're right, I was thinking about horizontal and vertical instead, which also might be neat options, but it is indeed not necessarily the flattest part. Your comment about using the curvature for placing the label on the flattest part makes way more sense to me now.
I've made some progress with automatic label placement on the flattest areas of a plot, using a modified rolling mean of curvature that finds the least curved section and sets the hjust to the proportion of the arc length at that point. I have written a small function that processes the data frame from inside geom_textpath, but this could perhaps be called from inside textpathGrob and made more efficient (it currently uses the split-apply-bind method). It's really just a proof of concept at the moment, but it seems to work pretty well:
library(geomtextpath)
#> Loading required package: ggplot2
df <- data.frame(x = 1:100, y = cos(seq(0, 2 * pi, len = 100)),
                 label = "A text label of moderate length.")
ggplot(df, aes(x, y, label = label)) + geom_textpath()
ggplot(df, aes(x, y, label = label)) + geom_textpath(hjust = "auto")
set.seed(1)
df <- data.frame(x = rnorm(100), y = rnorm(100))
ggplot(df, aes(x, y)) + geom_labeldensity2d()
ggplot(df, aes(x, y)) + geom_labeldensity2d(hjust = "auto")
It even finds a place for your label in the difficult economics example:
p <- ggplot(economics, aes(date, unemploy)) +
  geom_path(colour = "grey")
p + geom_textpath(
  aes(label = "Decline", group = 1),
  hjust = "auto", size = 5, include_line = FALSE)
Created on 2021-12-10 by the reprex package (v2.0.0)
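For readers following along, here is a rough sketch of the idea described above: smooth the absolute curvature with a rolling mean, find the least curved stretch, and express its position as a fraction of the total arc length. The helper name `flattest_hjust()`, its `window` argument, and the turning-angle estimate of curvature are all assumptions made for illustration; this is not the code in geomtextpath.

```r
# Hypothetical helper, NOT geomtextpath's implementation.
# Returns the fraction of arc length at the least curved part of a path.
flattest_hjust <- function(x, y, window = 11) {
  n  <- length(x)
  dx <- diff(x)
  dy <- diff(y)
  seg_len <- sqrt(dx^2 + dy^2)
  arclen  <- c(0, cumsum(seg_len))

  # Heading of each segment, then the turning angle at each interior
  # vertex, wrapped so that it lies between -pi and pi
  theta <- atan2(dy, dx)
  turn  <- diff(theta)
  turn  <- atan2(sin(turn), cos(turn))

  # Discrete curvature: turning angle per unit of local arc length
  curv <- abs(turn) / ((seg_len[-1] + seg_len[-(n - 1)]) / 2)

  # Centred rolling mean; positions without a full window become NA
  smooth <- as.numeric(stats::filter(curv, rep(1 / window, window), sides = 2))

  # curv[i] belongs to vertex i + 1; which.min() skips the NA edges
  best <- which.min(smooth) + 1
  arclen[best] / arclen[n]
}

# The cosine example from above: the flattest parts are the inflection points
x <- 1:100
y <- cos(seq(0, 2 * pi, length.out = 100))
flattest_hjust(x, y)
```

The window here is a fixed number of vertices. As discussed further down, a window based on the rendered text width would be a better fit, but that is only known at draw time.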
Yes, that does seem to work pretty well! I don't really worry about efficiency outside of the makeContent code, as it doesn't need to run every time the user resizes their window (though all else being equal, more efficiency is better than less). The only reason I can see to run this from within the makeContent code is that we would then know the exact text width, so we could choose an optimal window for calculating the running mean and get the appropriate curvature.
I tried testing whether the point of minimum curvature is stable under aspect ratio deformation, but this appears not to be the case.
set.seed(42)
# Random walk
x <- cumsum(rnorm(200))
y <- cumsum(rnorm(200))
plot(x, y, type = 'l')
# Aspect ratios to test
asp <- seq(1, 5, length.out = 100)
# Calculate curvature for every ratio
curv <- vapply(asp, function(mult) {
  geomtextpath:::.get_curvature(x * mult, y)
}, numeric(length(x)))
# Visualise curvature
image(
  list(y = asp, x = 1:200, z = curv),
  useRaster = TRUE, col = hcl.colors(255, "YlOrRd", rev = TRUE)
)
# The minima are not always at the same point
min_curv <- apply(curv, 2, which.min)
all(min_curv == min_curv[1])
#> [1] FALSE
Created on 2021-12-10 by the reprex package (v2.0.1)
However, there aren't many distinct minima in the example above (just 3), and if you use set.seed(0) there is only a single one, so my guess is that the minimum is relatively stable under deformation? (Update: I tested 100 seeds, and 37 of them had a single minimum.)
No, curvature isn't stable under aspect ratio changes. A circle has constant curvature all the way round, but if you change the aspect ratio you get an ellipse, which has higher curvature at the ends of its long axis than at the ends of its short axis.
I've moved the auto hjust inside the makeContent mechanism (it's now inside the anchor points function). It seems to work pretty well:
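To put numbers on the circle versus ellipse point, the snippet below evaluates the standard curvature formula for the parametric ellipse (a cos t, b sin t), which is ab / (a² sin² t + b² cos² t)^(3/2). The function name `ellipse_curvature()` is made up for this illustration.

```r
# Curvature of the ellipse (a * cos(t), b * sin(t)) at parameter t
ellipse_curvature <- function(t, a, b) {
  a * b / (a^2 * sin(t)^2 + b^2 * cos(t)^2)^(3 / 2)
}

t <- seq(0, 2 * pi, length.out = 9)

# Circle (a = b = 1): the same curvature at every point
ellipse_curvature(t, a = 1, b = 1)

# Stretch x by a factor of 2: curvature now depends on where you are
ellipse_curvature(t, a = 2, b = 1)
```

With a = 2 and b = 1 the curvature is a / b² = 2 at the ends of the long axis and b / a² = 0.25 at the ends of the short axis, so a stretch changes which parts of a curve count as flat.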
library(geomtextpath)
#> Loading required package: ggplot2
df <- data.frame(x = rep(sin(seq(0, 2*pi, len = 100)), 2),
                 y = rep(cos(seq(0, 2*pi, len = 100)), 2),
                 z = rep(c("A", "B"), each = 100),
                 label = "I think this is the flattest part of the curve")
p <- ggplot(df, aes(x, y, group = z, label = label)) +
  geom_textpath(vjust = 1.2, size = 6, hjust = "auto")
p + facet_grid(z~.)
p + facet_grid(.~z)
Created on 2021-12-11 by the reprex package (v2.0.0)
"xmin"/"xmax"/"xmid" for placement at the leftmost/rightmost or middle horizontal position on the curve.
I like these, as they are stable under aspect ratio changes. Could it be generalized for all xpos/ypos? I know right now I have some plots where I have to adjust hjust whenever I resize them, either directly, or indirectly via adding or removing legends, titles, etc. Such an option would be very useful for them.
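A small base R check of the stability claim, reusing the random walk from the earlier curvature test. It is only a sketch: it looks at which vertex an "xmax"-style rule would pick as x is stretched, and separately at the arc-length fraction of that vertex. How geomtextpath resolves the keyword internally is not shown in this thread.

```r
set.seed(42)
x <- cumsum(rnorm(200))
y <- cumsum(rnorm(200))
asp <- seq(1, 5, length.out = 100)

# Index of the rightmost vertex after stretching x by each factor
xmax_idx <- vapply(asp, function(mult) which.max(x * mult), integer(1))

# Arc-length fraction of that vertex for each stretch factor
arc_frac <- vapply(asp, function(mult) {
  arclen <- c(0, cumsum(sqrt(diff(x * mult)^2 + diff(y)^2)))
  arclen[which.max(x * mult)] / arclen[length(arclen)]
}, numeric(1))

# Is the same vertex chosen for every aspect ratio?
all(xmax_idx == xmax_idx[1])

# Spread of the numeric hjust that the vertex corresponds to
range(arc_frac)
```

Stretching an axis by a positive factor preserves the ordering of the coordinates, so the chosen vertex stays put. The arc-length fraction it maps to can still move, which is why a fixed numeric hjust needs retuning after a resize while a keyword does not.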
This is not an exhaustive list, but these came to mind.
A probably tricky-to-implement idea: avoid the other textpaths from the other groups/colors. Something like that would be great for the plot that I used when asking the original question.
I have implemented the positions mentioned above (though "auto" is just "flattest"). I will leave this issue open until we have had a play and done some testing. The "check overlap" that @byteit101 mentions is probably a separate issue.
library(geomtextpath)
#> Loading required package: ggplot2
p <- ggplot(iris, aes(x = Sepal.Length, group = 1))
p + geom_textpath(aes(label = "Default"), stat = "density", size = 6)
p + geom_textpath(aes(label = "auto"), stat = "density", size = 6,
                  hjust = "auto")
p + geom_textpath(aes(label = "xmin"), stat = "density", size = 6,
                  hjust = "xmin")
p + geom_textpath(aes(label = "xmid"), stat = "density", size = 6,
                  hjust = "xmid")
p + geom_textpath(aes(label = "xmax"), stat = "density", size = 6,
                  hjust = "xmax")
p + geom_textpath(aes(label = "ymin"), stat = "density", size = 6,
                  hjust = "ymin")
p + geom_textpath(aes(label = "ymid"), stat = "density", size = 6,
                  hjust = "ymid")
p + geom_textpath(aes(label = "ymax"), stat = "density", size = 6,
                  hjust = "ymax")
Created on 2021-12-12 by the reprex package (v2.0.0)
The "ymax" setting is actually pretty useful:
ggplot(iris, aes(x = Sepal.Length, colour = Species)) +
  geom_textpath(aes(label = Species), stat = "density",
                size = 6, fontface = 2, hjust = "ymax", vjust = -0.2)
This looks great! Out of curiosity, in the ymid case, is the left/right choice arbitrary or determined by something?
I thought you might like this thread: https://twitter.com/timelyportfolio/status/1469683836107866120.
Ah...I had noticed that the repo's stars had more than doubled in 24h but couldn't figure out why. Now I know!
The ymid setting literally finds the point on the path nearest the mean y value.
I can't figure out why the text isn't centered over the peaks with the ymax setting. I'll have a look at this and refactor the code (it's unnecessarily repetitive), plus write some tests before closing this issue.
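As a sketch of what "nearest the mean y value" could look like in code: the helper `ymid_hjust()` below is hypothetical and is not taken from the package. It picks the vertex whose y value is closest to mean(y) and reports its position as a fraction of arc length.

```r
# Hypothetical helper, NOT geomtextpath's implementation.
ymid_hjust <- function(x, y) {
  arclen <- c(0, cumsum(sqrt(diff(x)^2 + diff(y)^2)))
  # which.min() returns the first index when several vertices tie
  i <- which.min(abs(y - mean(y)))
  arclen[i] / arclen[length(arclen)]
}

# A symmetric tent: vertices 2 and 4 are equally close to mean(y)
x <- c(0, 1, 2, 3, 4)
y <- c(0, 1, 2, 1, 0)
ymid_hjust(x, y)
```

In a sketch like this, the left/right choice on a symmetric curve comes down to tie-breaking: which.min() returns the first match, so the left-hand candidate wins, and on real data tiny floating-point differences decide. Whether the package breaks ties the same way is not stated in this thread.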
The text wasn't centered over the peak because the default halign was "left": with any vjust below 0.5, the text was lined up with the first letter of a string that would sit nicely centered on the peak at a vjust of 0.5. I have switched the default to "center", since I am guessing that positioning single-line labels is a more common task than using multi-line labels, and in any case the user can change the halign if printing multi-line text. It seems unreasonable to expect the casual user to know that they should change the halign to correctly position single-line text.
I have added tests for this and we're back at 100% code coverage. The results look as expected on all 3 geoms, so I'll close this issue for now.
We were discussing in #27 that it might be convenient to have labels placed at some position. In particular, we were discussing hjust = "auto" for placing the label at the flattest part of the curve, but that got me thinking about other placement rules. I think the following keywords for the hjust parameter make sense:
- "flattest", as we discussed before: at the flattest part of the curve. We can also do "steepest" to do the inverse, but that makes less sense to me.
- "xmin"/"xmax"/"xmid" for placement at the leftmost/rightmost or middle horizontal position on the curve.
- "ymin"/"ymax"/"ymid" for placement at the bottom, top or middle vertical position on the curve.
- "before"/"after" for doing the equivalent of hjust = -1 or hjust = 2, where the text is anchored a text width away from the stated anchor point. This would translate to before the start of the curve or after the end of the curve (where the angles would be extrapolated based on the first/last angle). Consequently, this text would be flat and we can use the simplified per-string placement instead of the per-character placement.
This is not an exhaustive list, but these came to mind.