tidyverse / ggplot2

An implementation of the Grammar of Graphics in R
https://ggplot2.tidyverse.org

Could the `scale_*_steps*` functions support applying the full range of colour provided? #6068

Open davidhodge931 opened 2 months ago

davidhodge931 commented 2 months ago

The scale_*_steps* functions currently do not apply the full range of colours to a plot.

See the example below: scale_colour_binned applies the full range of the viridis colour palette, whereas scale_colour_stepsn supplied with the same viridis colours does not.

Generally, you need as big a range of colour as possible when colouring a numeric variable.

Could the scale_*_steps* functions apply the full range of colour by default to the plot?

library(tidyverse)
library(palmerpenguins)
library(patchwork)

p1 <- penguins |>
  ggplot() +
  geom_point(aes(x = flipper_length_mm, y = body_mass_g, col = flipper_length_mm)) +
  scale_colour_binned(type = "viridis") +
  labs(title = "binned")

p2 <- penguins |>
  ggplot() +
  geom_point(aes(x = flipper_length_mm, y = body_mass_g, col = flipper_length_mm)) +
  scale_colour_stepsn(colours = viridis::viridis(9)) +
  labs(title = "stepsn")

p1 + p2
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_point()`).
#> Removed 2 rows containing missing values or values outside the scale range
#> (`geom_point()`).

Created on 2024-09-01 with reprex v2.1.1

teunbrand commented 2 months ago

I've seen some SO posts around this, and I think having the option to do this is probably a good idea. I don't think we should make it a default for backward compatibility reasons though.

teunbrand commented 1 month ago

OK this is more nuanced than I originally thought. Essentially:

The difference becomes much clearer when you have unevenly distributed breaks:

library(ggplot2)
library(patchwork)

p <- ggplot(mpg, aes(displ, hwy, colour = cty)) +
  geom_point()

breaks <- c(8, 10, 12, 16, 20, 24)

p1 <- p +
  scale_colour_binned(type = "viridis", breaks = breaks) +
  labs(title = "binned")

p2 <- p +
  scale_colour_stepsn(colours = viridis::viridis(9), breaks = breaks) +
  labs(title = "stepsn")

p1 + p2

Created on 2024-09-16 with reprex v2.1.1

To use the 'full range' of the continuous scale, the values should be rescaled using the most extreme breaks rather than the limits. Unfortunately, the plumbing isn't set up to deal with this, so it is harder to do than anticipated.
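
To make the rescaling point concrete, here is a minimal base-R sketch. It does not call any ggplot2 internals. The limits, the helper names (rescale01, to_hex) and the use of hcl.colors as a stand-in for viridis::viridis are assumptions made for illustration, and the last step is only one literal reading of "rescale using the most extreme breaks", not what ggplot2 currently does.

```r
breaks <- c(8, 10, 12, 16, 20, 24)  # breaks from the example above
limits <- c(5, 35)                  # hypothetical scale limits, for illustration

# Hypothetical helper: linear rescale so that `from` maps onto [0, 1].
rescale01 <- function(x, from) (x - from[1]) / (from[2] - from[1])

# Bin edges and bin midpoints in data units (7 bins for 6 breaks).
edges <- c(limits[1], breaks, limits[2])
mids  <- edges[-1] - diff(edges) / 2

# A continuous gradient built from base R's viridis-like HCL palette.
gradient <- grDevices::colorRamp(grDevices::hcl.colors(9, "viridis"))
to_hex <- function(pos) {
  m <- gradient(pos)  # one row of RGB values (0-255) per position
  grDevices::rgb(m[, 1], m[, 2], m[, 3], maxColorValue = 255)
}

# Rescaling the midpoints against the limits: every position lies strictly
# inside (0, 1), so neither end of the gradient is ever used.
pos_limits <- rescale01(mids, from = limits)
range(pos_limits)  # roughly 0.05 and 0.82 with these numbers
cols_limits <- to_hex(pos_limits)

# Assumed reading of the suggestion: rescale against the outermost breaks and
# clamp to [0, 1], so the two outer bins land on the ends of the gradient.
pos_breaks <- rescale01(mids, from = range(breaks))
pos_breaks <- pmin(pmax(pos_breaks, 0), 1)
cols_breaks <- to_hex(pos_breaks)
```

With these numbers the inner bins keep their uneven spacing along the gradient, while the first and last bins take the gradient's end colours.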

davidhodge931 commented 1 month ago

Ah, I see - thanks for the explanation.

I think it is a valid and common use case to want to colour bins with your own custom colours and apply that palette in a discrete way, like scale_colour_binned does.

Feel free to close