grantmcdermott / tinyplot

Lightweight extension of the base R graphics system
https://grantmcdermott.com/tinyplot
Apache License 2.0

Continuous legend #84

Closed grantmcdermott closed 4 months ago

grantmcdermott commented 7 months ago

Add a continuous (gradient) legend for numeric variables with many distinct values.

See suggestions here: https://stackoverflow.com/questions/13355176/gradient-legend-in-base

grantmcdermott commented 5 months ago

I was playing around with some ideas over the weekend and I think it's possible to hack this approach in a way that (a) integrates with the existing plot2::draw_legend() logic and (b) looks pretty reasonable.

Proof of concept:

library(plot2)

# data and input columns

dataset = iris
xlab = "Petal.Length"
ylab = "Sepal.Length"
collab = "Sepal.Length"

xvar = dataset[[xlab]]
yvar = dataset[[ylab]]
colvar = dataset[[collab]]

# color palette choices and funcs

ncolors = 101L
nlabs = 5L
pal = hcl.colors(ncolors, "inferno", alpha = 1)
palramp = colorRampPalette(pal, alpha = TRUE)

# colors for the actual plot points
cols = palramp(length(colvar))
## Note: we reorder now so we can assign colors as part of the points()
## call later on. (Saves having to split by a continuous variable.)
colord = findInterval(colvar, sort(colvar))
cols = cols[colord]
## NB! This code will have to change for the formal implementation,
## since it assumes the colour variable is uniformly distributed.
## (i.e., it scales by order, not proportional to the actual value.)

# legend color ramp

colvar_range = range(colvar)
pcolvar = pretty(colvar_range, n = nlabs)
pcolvar = pcolvar[pcolvar >= colvar_range[1] & pcolvar <= colvar_range[2]]
if (length(pcolvar)>nlabs) pcolvar = pcolvar[-1] # optional
colvar2 = seq(colvar_range[1], colvar_range[2], length.out = ncolors)
colvar2 = sort(unique(c(colvar2, pcolvar)))
pidx = findInterval(pcolvar, colvar2)
lgd = rep(NA, times = length(colvar2))
lgd[pidx] = pcolvar
lgd = rev(lgd)
lcols_all = rev(palramp(length(colvar2)))
## Add "padding" on either side, otherwise the y.intersp adjustment
## below will cause the labels to look funny
lgd = c(NA, lgd, NA)
lcols_all = c(NA, lcols_all, NA)

## draw the legend using our bespoke color ramp and pretty labels

# adjust vertical legend spacing (compress it so the legend is about as tall
# as if there were only 'nlabs' discrete colours in it)
intersp_adj = rep(1 / ncolors * nlabs, times = length(lcols_all))
intersp_adj[1] = 1
intersp_adj[length(intersp_adj)] = 1

# now draw the legend
draw_legend(
    legend = "right!",
    lgnd_labs = lgd,
    legend.args = list(
        title = collab,
        # bty = "o",
        y.intersp = intersp_adj,
        adj = c(0,0),
        pt.cex = 3.5
    ),
    type = "p",
    pch = 22,
    col = NA, #lcols_all,
    bg = lcols_all
)

# Manually add the points and other plot region elements
# (mostly to show what the final plot would look like)

plot.window(xlim = range(xvar), ylim = range(yvar))
axis(1); axis(2); grid()
title(xlab = xlab, ylab = ylab, main = "Continuous legend")
points(
    x = xvar, y = yvar,
    pch = 19,
    col = cols
)

Created on 2024-01-31 with reprex v2.1.0
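
The caveat in the code comments above (colours are assigned by order, not in proportion to the actual value) is easiest to see with a skewed variable. The following is a minimal sketch, not part of the prototype: the vector `x`, the 5-colour ramp, and the names `idx_rank` and `idx_value` are made up for illustration.

```r
# Hypothetical skewed input: four small values and one large outlier
x = c(1, 2, 3, 4, 100)
ncolors = 5L
pal = hcl.colors(ncolors, "inferno")

# Order-based index, as in the proof of concept above:
# each value gets the colour that corresponds to its rank
idx_rank = findInterval(x, sort(x))

# Value-proportional index: rescale x linearly onto 1..ncolors
idx_value = round((x - min(x)) / diff(range(x)) * (ncolors - 1L) + 1)

# Side-by-side comparison of the two mappings
data.frame(
    x, idx_rank, idx_value,
    col_rank = pal[idx_rank], col_value = pal[idx_value]
)
```

With the order-based index the four small values are spread across most of the ramp, whereas the value-proportional index puts them all on the first colour and only the outlier at the far end.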

grantmcdermott commented 5 months ago

Better version that uses a rescaling function to get the colour mapping right. (Helps to avoid finicky label matching code too.)

library(plot2)

# rescaling function
rescale_func = function (x, from = NULL, to = NULL) {
    if (is.null(from)) from = range(x)
    if (is.null(to)) to = c(1, 100)
    (x - from[1])/diff(from) * diff(to) + to[1]
}

# data and input columns

dataset = iris
xlab = "Petal.Length"
ylab = "Sepal.Length"
collab = "Sepal.Length"

xvar = dataset[[xlab]]
yvar = dataset[[ylab]]
colvar = dataset[[collab]]

# color palette choices and funcs

ncolors = 100L
nlabs = 5L
pal = hcl.colors(ncolors, "inferno", alpha = 1)
palramp = colorRampPalette(pal, alpha = TRUE)

# generate 'ncolors' (here: 100) distinct color categories for the plot
cols = palramp(ncolors)

# legend color ramp

## Identify the pretty break points for our labels
ucolvar = unique(colvar)
colvar_range = range(ucolvar)
pcolvar = pretty(colvar_range, n = nlabs)
pcolvar = pcolvar[pcolvar >= colvar_range[1] & pcolvar <= colvar_range[2]]
# optional thinning
if (length(ucolvar)==2 && all(ucolvar %in% pcolvar)) {
    pcolvar = ucolvar
} else if (length(pcolvar)>nlabs) {
    pcolvar = pcolvar[seq_along(pcolvar) %% 2 == 0]
}

## Find the (approximate) location of our pretty labels
pidx = rescale_func(c(colvar_range, pcolvar), to = c(1, ncolors))[-c(1:2)]
pidx = round(pidx)
lgd_labs = rep(NA, times = length(cols))
lgd_labs[pidx] = pcolvar
# We have to reverse the order since the legends are in decreasing sequence
lgd_labs = rev(lgd_labs)
lgd_cols = rev(cols)

# Add "padding" on either side, otherwise the y.intersp adjustment
# below will cause the labels to look funny
lgd_labs = c(NA, lgd_labs, NA)
lgd_cols = c(NA, lgd_cols, NA)

## draw the legend using our bespoke color ramp and pretty labels

intersp_adj = rep(1 / ncolors * nlabs, times = length(lgd_cols))
intersp_adj[1] = 1
intersp_adj[length(intersp_adj)] = 1
draw_legend(
    legend = "right!",
    lgnd_labs = lgd_labs,
    legend.args = list(
        title = collab,
        # bty = "o",
        y.intersp = intersp_adj,
        adj = c(0,0),
        pt.cex = 3.5
    ),
    type = "p",
    pch = 22,
    col = NA,
    bg = lgd_cols
)

# Manually add the points and other plot region elements
# (mostly to show what the final plot would look like)

plot.window(xlim = range(xvar), ylim = range(yvar))
axis(1); axis(2); grid()
title(xlab = xlab, ylab = ylab, main = "Continuous legend")
points(
    x = xvar, y = yvar,
    pch = 19,
    # pass 'to' explicitly so the index stays valid if ncolors is changed
    col = cols[round(rescale_func(colvar, to = c(1, ncolors)))]
)

Created on 2024-02-01 with reprex v2.1.0
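
To make the label placement step concrete, here is the same rescaling run on its own, without any plotting. This is only a sketch that reuses the helper and variable names from the prototype above; the final `data.frame()` call is added purely for inspection.

```r
# Same rescaling helper as in the prototype above
rescale_func = function(x, from = NULL, to = NULL) {
    if (is.null(from)) from = range(x)
    if (is.null(to)) to = c(1, 100)
    (x - from[1]) / diff(from) * diff(to) + to[1]
}

ncolors = 100L
colvar = iris$Sepal.Length
colvar_range = range(colvar)

# Pretty break points that fall inside the data range
pcolvar = pretty(colvar_range, n = 5L)
pcolvar = pcolvar[pcolvar >= colvar_range[1] & pcolvar <= colvar_range[2]]

# Prepend the range so the breaks are rescaled relative to the data limits,
# then drop those two helper entries again
pidx = round(rescale_func(c(colvar_range, pcolvar), to = c(1, ncolors))[-(1:2)])

# Each label sits at the swatch whose index matches its rescaled value
data.frame(label = pcolvar, swatch = pidx)
```

Because the labels and the plotted points go through the same linear map, a label lands next to (approximately) the colour that a point with that value would receive.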

grantmcdermott commented 5 months ago

Note to self: This intersp trick probably won't work for horizontal legends (incl. "top!" and "bottom!"). We can just exclude those for now.
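
For context, the trick relies entirely on `y.intersp`, which sets the vertical spacing between legend rows, so it has no obvious counterpart when entries are laid out side by side. The arithmetic behind the vertical compression can be sketched as follows, using the same values as the prototype; `n_entries` is a name introduced here for illustration.

```r
ncolors = 100L
nlabs = 5L

# One legend entry per colour swatch, plus an NA padding entry at each end
n_entries = ncolors + 2L

# Shrink the line spacing of every swatch entry;
# keep the two padding entries at full height
intersp_adj = rep(nlabs / ncolors, times = n_entries)
intersp_adj[c(1, n_entries)] = 1

# Summed spacing in "lines": about nlabs for the ramp plus 2 for the padding
sum(intersp_adj)
```

In other words, the 100 swatches together take up roughly the height of `nlabs` ordinary legend entries.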

grantmcdermott commented 5 months ago

@zeileis As our resident colour expert: Do you have a recommendation for the default continuous palette? Right now, it's "viridis" but I'm not the biggest fan TBH. Do you perhaps think "cividis", or one of the single-hue HCL palettes like "blues3", would be a better choice? Something else?

Background: I have the basic code working for native continuous legends in tinyplot now. So it seems a good time to decide the default palette. Happy to share some screenshots if that's helpful.

zeileis commented 5 months ago

It depends what you want to use the palette for.

In short: For shading areas I would use YlGnBu. For points and lines I would go with a multi-hue sequential palette with a reasonable range of hues and relatively high chroma throughout. Maybe Mako or Rocket are a bit less busy compared to viridis.
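
For a quick side-by-side look at the palettes named above, something like the following may help. It is only a sketch using base R's `hcl.colors()`: the swatch count `n` and the strip layout are arbitrary choices, and the availability check is there because the set of built-in palette names depends on the R version.

```r
# Candidate palettes named in the comment above; keep only those that this
# R installation's hcl.colors() actually provides
candidates = c("Viridis", "Mako", "Rocket", "YlGnBu")
available = candidates[candidates %in% hcl.pals()]

n = 7L
pals = lapply(available, function(p) hcl.colors(n, palette = p))
names(pals) = available

# Draw one strip of n swatches per palette
op = par(mfrow = c(length(pals), 1), mar = c(0.5, 4, 0.5, 0.5))
for (p in names(pals)) {
    image(matrix(seq_len(n)), col = pals[[p]], axes = FALSE)
    mtext(p, side = 2, las = 1, line = 0.5)
}
par(op)
```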

I'm happy to play around with this a bit if you provide code for a couple of typical examples you have in mind.

grantmcdermott commented 5 months ago

Thanks @zeileis! Right now, we don't support heatmaps, so it's mostly for points and similar plot types. (Exception: Adding continuous color support for line plots is v. tricky with the base graphics scaffolding, so I don't plan on supporting continuous legends for line plots yet.)

I do like the Mako and Rocket palettes—and use them a lot in my own plots—so just from a pure aesthetic perspective it would be between those and Cividis for me. But I defer to your expert knowledge...

Let me tidy up this prototype code a little bit and push the changes to the continuous-legend branch that I'm working on. I'll ping you here when it's ready, so you can pull and play around with it on your local machine.

zeileis commented 5 months ago

Re: heatmaps. We don't support them _yet_ ;-) More seriously, for me the question is more whether we will have a conceptual difference somewhere between the default continuous palette for shading areas and the one for points/lines. Or whether the defaults are set per function anyway.

Re: cividis. My personal opinion is that this is a very bad choice for almost all situations. It collapses many color contrasts that would help trichromats (with "normal" color vision) to better distinguish the colors in a palette. Also, it's a weird combination of a sequential and a diverging palette.

Re: pinging later. :rocket:

grantmcdermott commented 4 months ago

Hey @zeileis, I ended up doing a bit more work on the branch, so that it was ready for PR in #122. Even though I've marked it as [WIP], it's ready for you to kick the tires. (I just need to clean up some things internally and add tests, etc.) Feel free to play around and lmk what you think.