r-tmap / tmap

R package for thematic maps
https://r-tmap.github.io/tmap
GNU General Public License v3.0

Color palettes: native support for the `pals` package? #566

Closed mtennekes closed 1 year ago

mtennekes commented 3 years ago

Currently, tmap natively supports palettes from ColorBrewer and viridis. That is, one can specify palette = "Blues", and tmap automatically selects the required number of colors from the ColorBrewer palette "Blues". The same holds for the viridis palettes. A small novelty is that users can reverse a palette by putting a minus sign in front of its name.
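The minus-sign convention described above can be illustrated with a small sketch. This is not tmap's implementation: get_pal() is a hypothetical helper, and hcl.colors() from grDevices (R >= 3.6) is used only as a stand-in palette source.

```r
# Hypothetical helper: resolve a palette name, reversing it when the
# name is prefixed with "-". hcl.colors() stands in for the real lookup.
get_pal <- function(palette, n) {
  reverse <- startsWith(palette, "-")   # leading minus means "reversed"
  name <- sub("^-", "", palette)        # strip the minus to get the real name
  cols <- hcl.colors(n, palette = name)
  if (reverse) rev(cols) else cols
}

get_pal("viridis", 5)
get_pal("-viridis", 5)  # same colors in the opposite order
```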

How shall we do this in tmap v4?

There are (at least) two recent developments that are worthwhile to consider:

  1. The package pals seems the way to go. I wasn't aware of this package until about a year ago. It contains a very rich and concise set of color palettes, including ColorBrewer and viridis. There are lots of other palettes that work very well with maps, for instance kovesi.rainbow (run pal.map(pals::kovesi.rainbow(10))), and there are palettes for bivariate choropleths (see https://nowosad.github.io/post/cbc-bp2/).
  2. As of R 4.0, the standard palettes have been improved. See https://developer.r-project.org/Blog/public/2019/11/21/a-new-palette-for-r/ There are some useful palettes for spatial data as well, but most (all?) are also included in pals. What I dislike is that the palette names there contain white space.

If we go with pals:

Nowosad commented 3 years ago

  1. I think keeping things simple would be the best choice here. It could be done either by using the built-in R tools (especially palette.colors() and hcl.colors(), see examples below) or by using just one external package that could be extended in the future. That being said, I see the value of the pals package, especially in terms of bivariate palettes.
  2. There is also the paletteer package - https://github.com/EmilHvitfeldt/paletteer.
  3. I looked at the pals code for some time, but could not find anything similar to RColorBrewer::brewer.pal.info or RColorBrewer::display.brewer.all(). If you decide on using pals, it could be good to get in touch with the package author to discuss some additional/missing features.

palette.swatch <- function(palette = palette.pals(), n = 8, nrow = 8,
                           border = "black", cex = 1, ...)
{
  cols <- sapply(palette, palette.colors, n = n, recycle = TRUE)
  ncol <- ncol(cols)
  nswatch <- min(ncol, nrow)
  op <- par(mar = rep(0.1, 4),
            mfrow = c(1, min(5, ceiling(ncol/nrow))),
            cex = cex, ...)
  on.exit(par(op))
  while (length(palette)) {
    subset <- seq_len(min(nrow, ncol(cols)))
    plot.new()
    plot.window(c(0, n), c(0.25, nrow + 0.25))
    y <- rev(subset)
    text(0, y + 0.1, palette[subset], adj = c(0, 0))
    y <- rep(y, each = n)
    rect(rep(0:(n-1), n), y, rep(1:n, n), y - 0.5,
         col = cols[, subset], border = border)
    palette <- palette[-subset]
    cols    <- cols [, -subset, drop = FALSE]
  }
}

palette.swatch()

hcl.swatch <- function(type = NULL, n = 5, nrow = 11,
                       border = if (n < 15) "black" else NA) {
  palette <- hcl.pals(type)
  cols <- sapply(palette, hcl.colors, n = n)
  ncol <- ncol(cols)
  nswatch <- min(ncol, nrow)

  par(mar = rep(0.1, 4),
      mfrow = c(1, min(5, ceiling(ncol/nrow))),
      # pin = c(1, 0.5 * nswatch),
      cex = 0.7)

  while (length(palette)) {
    subset <- 1:min(nrow, ncol(cols))
    plot.new()
    plot.window(c(0, n), c(0, nrow + 1))
    text(0, rev(subset) + 0.1, palette[subset], adj = c(0, 0))
    y <- rep(subset, each = n)
    rect(rep(0:(n-1), n), rev(y), rep(1:n, n), rev(y) - 0.5,
         col = cols[, subset], border = border)
    palette <- palette[-subset]
    cols <- cols[, -subset, drop = FALSE]
  }

  par(mfrow = c(1, 1), mar = c(5.1, 4.1, 4.1, 2.1), cex = 1)
}
hcl.swatch("qualitative")

hcl.swatch("sequential")

hcl.swatch("diverging")

hcl.swatch("divergingx")

zeileis commented 3 years ago

I'm of course biased because I co-wrote hcl.colors() and palette.colors() but I wanted to add two small comments to the discussion:

hcl.colors(3, "Light Grays")
## [1] "#474747" "#A8A8A8" "#E2E2E2"
hcl.colors(3, "lightgray")
## [1] "#474747" "#A8A8A8" "#E2E2E2"
palette.colors(3, "Okabe-Ito")
##     black    orange   skyblue 
## "#000000" "#E69F00" "#56B4E9" 
palette.colors(3, "okabe")
##     black    orange   skyblue 
## "#000000" "#E69F00" "#56B4E9" 
fx <- function(x) tolower(gsub("[-, _, \\,, (, ), \\ , \\.]",  "", x))
charmatch(fx(name), fx(hcl.pals()))

If I can help with anything else regarding these palettes, let me know.

carbonmetrics commented 3 years ago

General comment: color palettes often do not take into account that 8% (eight percent!) of males have red-green color blindness. Paul Tol has put a lot of thought into color schemes that are: "...distinct for all people, including colour-blind readers; distinct from black and white; distinct on screen and paper; matching well together." His technical note can be found here and has been implemented in the R package khroma.

zeileis commented 3 years ago

Yes, indeed, this is an important issue. It was also an important consideration for the new default palette() in R starting from version 4.0.0 and for the default palette returned by palette.colors() (namely Okabe-Ito). Also, many of the HCL-based palettes in hcl.colors() are very robust for color-blind viewers. We tried to convey this in the corresponding documentation, which I have linked below and which contains further links and references to the literature.

Having said that, it is also clear that even with such perceptually-based palettes it is not always straightforward to pick colors that work well for all audiences. It helps to have good palette functions available but they are no "magic bullet".

mtennekes commented 3 years ago

Thanks for your input @zeileis and @carbonmetrics!

For the end-users it is crucial that they can have a visual overview of the available palettes (like RColorBrewer::display.brewer.all()). This could either be an upgrade of tmaptools::palette_explorer, or an implementation of such "catalogization" functions.

@carbonmetrics and @nfrerebeau Awesome! I am a bit reluctant to rely on khroma since it imports additional packages (but the same applies for pals).

I noticed that the palette names of palette.colors() and hcl.colors() on the one hand and those from pals on the other hand do intersect each other. For that matter we could use both.

I am also considering the following workflow. By default: only base (grDevices) palettes are supported. But once the user runs tmap_use("pals") the palettes from pals become available. And the same for tmap_use("khroma").
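A minimal sketch of how such an opt-in workflow might look. Everything here is hypothetical: tmap_use_sketch() and palette_sources() are illustration names, not tmap functions, and the only real calls are base R's requireNamespace() and union().

```r
# Hypothetical registry of enabled palette sources; only base palettes
# (grDevices) are enabled by default.
.pal_registry <- new.env()
.pal_registry$enabled <- "grDevices"

# Hypothetical stand-in for the proposed tmap_use(): enable the palettes
# of an optional package, but only if that package is installed.
tmap_use_sketch <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is not installed.", call. = FALSE)
  }
  .pal_registry$enabled <- union(.pal_registry$enabled, pkg)
  invisible(.pal_registry$enabled)
}

# List the sources whose palettes are currently available.
palette_sources <- function() .pal_registry$enabled

palette_sources()
# tmap_use_sketch("pals")    # would add "pals" if it is installed
# tmap_use_sketch("khroma")  # likewise for khroma
```

The appeal of this design is that pals and khroma stay optional (suggested rather than imported), so the default installation remains light.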

We can keep the discussion ongoing. There is no need to make decisions at the moment, since I'm still working (and slowly but nicely progressing) with the foundations of tmap v4.

carbonmetrics commented 3 years ago

Just to be sure, I am not promoting khroma, but the work of Paul Tol.

zeileis commented 3 years ago

The functions hcl.swatch() and palette.swatch() are only on the manual page because we felt that (a) it was unclear what exactly the specs of the function should be, (b) it was easy enough to run example(palette.colors) or look on the web pages, and (c) we wondered whether it would be important to have only one function that displays all the different *.colors() functions.

But I would be interested to hear what your thoughts on such a function would be. Then I can raise it with Paul and maybe we can put something into a future version of R.

Last comment: I'm not sure whether the unavailability of built-in *.swatch() functions is really a major obstacle in the use of these palettes. My feeling is that there are essentially two camps of users: (a) I Know What I Like users, many of whom keep on using RColorBrewer and/or viridis(Lite) and (b) The More The Merrier users who favor pals or paletteer. Our approach tries to be somewhere in between (as already pointed out in another posting above) because we try to provide many palettes but not too many. The list may also keep growing slowly in the future. For example, we have added a few of Crameri's palettes (notably Batlow) and the new viridis palettes (Mako and Rocket).

mtennekes commented 3 years ago

I've made a data.frame for internal use that contains palette metadata from three packages: grDevices, pals and rcartocolor. It does not contain the palettes themselves, only the names of the functions used to obtain them. This data.frame is stored in sysdata.rda and is called .tmap_pals (in the branch v4).
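The idea of storing function names rather than colors can be sketched as follows. The column layout and the helper get_palette_sketch() are assumptions made for illustration and do not reproduce the actual structure of .tmap_pals; only the "series__name" naming of full palette names is taken from this thread. The sketch needs R >= 4.0 for palette.colors().

```r
# Hypothetical metadata table: one row per palette, holding the package,
# function and palette argument needed to generate the colors on demand.
pal_meta <- data.frame(
  name    = c("viridis", "okabeito"),
  series  = c("hcl", "palette"),
  package = c("grDevices", "grDevices"),
  fun     = c("hcl.colors", "palette.colors"),
  arg     = c("viridis", "Okabe-Ito"),
  stringsAsFactors = FALSE
)
pal_meta$fullname <- paste(pal_meta$series, pal_meta$name, sep = "__")

# Hypothetical lookup: find the row, fetch the function from its package,
# and call it with the requested number of colors.
get_palette_sketch <- function(fullname, n) {
  row <- pal_meta[pal_meta$fullname == fullname, ]
  stopifnot(nrow(row) == 1)
  f <- getExportedValue(row$package, row$fun)
  f(n = n, palette = row$arg)
}

get_palette_sketch("hcl__viridis", 5)
get_palette_sketch("palette__okabeito", 3)
```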


Please let me know what you think! Ideas and suggestions are welcome. A part of me is still tempted to write a new R package for colors, which is light-weight (no dependencies), and contains a huge collection of palettes but well organized.

Nowosad commented 3 years ago

Hey @mtennekes, my few comments:

  1. One regular users' problem with palette_explorer() was that it blocked the R session. Therefore, I like the new version, which is just a viewer panel not interfering with an R session...
  2. I think you can remove the "carto" palettes. I believe all of them are already implemented in hcl.colors(). @zeileis is that correct? EDIT: It could also be the case for "brewer" and "viridis"...

zeileis commented 3 years ago

hcl.colors() contains HCL approximations of all sequential and diverging palettes from rcartocolor. The same holds for viridis and RColorBrewer. See: http://colorspace.R-Forge.R-project.org/articles/approximations.html

mtennekes commented 3 years ago

Thanks! I still see the benefits of having "brewer" and "viridis", but the "carto" palettes seem to be more redundant.

The categorical "carto" palettes are unique, but I am totally fine with dropping these. Personally, I like the brewer and tableau categorical palettes better.

Nowosad commented 3 years ago

@mtennekes, I can only suggest you consider keeping the carto Safe palette. It is quite all right for people with color deficiencies - https://nowosad.github.io/colorblindcheck/articles/articles/check_rcartocolor.html#safe.

mtennekes commented 3 years ago

A follow-up on the color palettes discussion in #593, in particular on color-blind-friendly color palettes https://github.com/r-tmap/tmap/issues/593#issuecomment-913585559

I've checked all tmap palettes (that are currently in v4). For each palette, I looked up the minimum distance, which I think is the most straightforward (and also best?) metric to score palettes. I do this for normal "norm" and for the cvd categories grouped ("cvd") where I take the minimum of the three.

library(colorblindcheck)
library(tidyverse)
library(tmap) # version 4 needed
#> 
#> Attaching package: 'tmap'
#> The following object is masked from 'package:datasets':
#> 
#>     rivers

pal_names = tmap:::.tmap_pals$fullname
pal_values = lapply(pal_names, FUN = tmap:::tmapGetPalette, n = 7)

pal_scores = round(as.data.frame(t(sapply(pal_values, function(p) {
    res = colorblindcheck::palette_check(p, plot = FALSE)
    c(norm = res$min_dist[1], cvd = min(res$min_dist[-1]))
}))))
tmap_pals = cbind(tmap:::.tmap_pals, pal_scores)

for (typ in c("cat", "seq", "div")) {
    cat("\nType: ", typ, "------------------------------\n")
    tmap_pals %>%
        filter(type == typ) %>% 
        select(name, series, maxn, fullname, norm, cvd) %>% 
        arrange(desc(cvd)) %>% 
        head(5) %>% 
        print()
}
#> 
#> Type:  cat ------------------------------
#>       name  series maxn          fullname norm cvd
#> 1      tol    misc   12         misc__tol   20  12
#> 2 okabeito palette    9 palette__okabeito   22  11
#> 3    okabe    misc    8       misc__okabe   22  11
#> 4    kelly    misc   22       misc__kelly   18   9
#> 5 stepped2    misc   20    misc__stepped2   10   9
#> 
#> Type:  seq ------------------------------
#>                           name series maxn                             fullname
#> 1                     warmcool   misc  Inf                       misc__warmcool
#> 2                         oslo    hcl  Inf                            hcl__oslo
#> 3                    ocean.ice  ocean  Inf                     ocean__ocean.ice
#> 4 kovesi.linear_green_5_95_c69 kovesi  Inf kovesi__kovesi.linear_green_5_95_c69
#> 5                      lajolla    hcl  Inf                         hcl__lajolla
#>   norm cvd
#> 1   14  14
#> 2   13  13
#> 3   14  11
#> 4   12  11
#> 5   12  10
#> 
#> Type:  div ------------------------------
#>          name series maxn         fullname norm cvd
#> 1        broc    hcl  Inf        hcl__broc   19  18
#> 2      lisbon    hcl  Inf      hcl__lisbon   17  18
#> 3         vik    hcl  Inf         hcl__vik   21  16
#> 4        prgn    hcl  Inf        hcl__prgn   21  16
#> 5 purplegreen    hcl  Inf hcl__purplegreen   22  15

Created on 2021-09-08 by the reprex package (v2.0.0)

Although this is helpful, it is certainly not conclusive:

I'd like to make a selection of recommended color palettes, which we can call the "tmap" series. These palettes should:

zeileis commented 3 years ago

Thanks for all this work, very much appreciated!

A couple of comments regarding suitability of the palettes - also in combination with potential color vision deficiencies:

The luminance aspects for sequential and diverging palettes are relatively easy to check. I did this for the sequential palettes in RColorBrewer, rcartocolor, scico, and viridis. The R-squared of the luminance gradient with a perfect straight line was above 0.95 for all palettes from all packages. The average luminance ranges differ somewhat between the packages: for RColorBrewer sequential palettes the average is 66, for rcartocolor 55, for scico 80, and for viridis 88. Some would argue that generally more is better here but this probably depends on the context and on how many colors need to be clearly distinguishable. For the diverging palettes the R-squared values with a perfect triangle are somewhat lower (notably due to imbalance between the two arms) but still all above 0.85. However, I have not yet tried to formalize the hue checks for these palettes. (Let me know if you are interested in the code. It is a bit too long for a comment here and I haven't yet wrapped it up elsewhere.)
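A rough sketch of such a luminance check for a sequential palette, using only grDevices. This is not the code referred to above: luminance_r2() is a hypothetical helper, and taking CIELAB L* as the luminance measure is an assumption (the original check may have used a different color space).

```r
# Hypothetical helper: how close is the luminance gradient of a palette
# to a perfect straight line, and how wide is its luminance range?
luminance_r2 <- function(cols) {
  # col2rgb() gives a 3 x n matrix on the 0-255 scale; convertColor()
  # expects one color per row with sRGB values in [0, 1].
  rgb01 <- t(col2rgb(cols)) / 255
  lab <- convertColor(rgb01, from = "sRGB", to = "Lab")
  lum <- lab[, 1]  # first column is L* (0 = black, 100 = white)
  list(
    r2    = cor(lum, seq_along(lum))^2,  # squared correlation with a line
    range = diff(range(lum))             # luminance range covered
  )
}

res <- luminance_r2(hcl.colors(9, "viridis"))
res$r2
res$range
```

For diverging palettes, the same idea would apply with a triangle (two linear arms) as the reference shape instead of a single line.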

A few more remarks about your suggestions:

mtennekes commented 3 years ago

Great observations, thanks!

We could (should?) wrap our color palette testing scripts into functions and put them in a github repo. Perhaps we can merge them with https://github.com/Nowosad/colorblindcheck (so making this package a bit more general). What do you think @Nowosad ?

Regarding the last point, have you seen the paper that won the best paper award at EuroVis 2021? See https://diglib.eg.org/bitstream/handle/10.1111/cgf14288/v40i3pp049-060.pdf Very interesting and quite the opposite of what most experts (including us) advise.

zeileis commented 3 years ago

mtennekes commented 3 years ago

Agreed. I mentioned color palettes from paintings just as an example of harmonious colors, not to say that they are therefore always good for statistical purposes. I like the explanation and examples in your video about that. The analogy with musical scales also holds here: some notes in the C major scale attract more attention than others (e.g. the major third and major seventh). I cannot think of a musical scale in which all notes stand out equally, perhaps the chromatic scale?

Yes, the experiments of the Eurovis2021 paper are indeed limited. That is also what they explained in their presentation. Even though I was (and still am) skeptical about using such flashy 'HSV' palettes, it is still an eye-opener to me that they performed better for some tasks.

zeileis commented 3 years ago

My feeling is that it is the implicit categorization that makes spotting the odd one out easier. I would have liked to see a comparison with:

Nowosad commented 3 years ago

> We could (should?) wrap our color palette testing scripts into functions and put them in a github repo. Perhaps we can merge them with https://github.com/Nowosad/colorblindcheck (so making this package a bit more general). What do you think @Nowosad ?

Great idea. A PR would be very welcome, and let me know if you need anything from me.

mtennekes commented 3 years ago

@zeileis Technical (and slightly off topic) question: do you know what the fastest way is to convert a vector of named colors to hexadecimal colors? It can be done via col2rgb and rgb but that is not very fast for many colors.

tim-salabim commented 3 years ago

@mtennekes col2rgb and rgb is what we use in mapview (https://github.com/r-spatial/mapview/blob/master/R/color.R#L123-L134). I don't find it so slow to be honest.
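For reference, the col2rgb()/rgb() route mentioned here can be written as a one-liner. The wrapper name to_hex() is hypothetical; this sketch drops any alpha channel.

```r
# Hypothetical wrapper: convert named (or hex) colors to hexadecimal codes.
# col2rgb() returns a 3 x n matrix (rows red, green, blue) on the 0-255
# scale; rgb() accepts the transposed n x 3 matrix directly.
to_hex <- function(x) {
  rgb(t(col2rgb(x)), maxColorValue = 255)
}

to_hex(c("red", "steelblue", "#00FF00"))
```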

zeileis commented 3 years ago

All the functions that I am aware of also use col2rgb() and rgb() under the hood. Both are .Calls to C_col2rgb and C_rgb internally. So I would have expected them to be reasonably fast. Have you had a closer look what exactly takes relatively long?

mtennekes commented 3 years ago

Thanks @tim-salabim and @zeileis . It was inefficient programming on my side: I had to do the color-to-hex conversion earlier in the process. Now it works fine. (But still, a = col2rgb(rep("red", 1e6)) takes a few seconds, which I wouldn't expect)

zeileis commented 3 years ago

Good point. When you have millions of colors then indexing them would probably be a good idea. You could also have a look at the underlying C code to see whether this could be changed relatively easily. I think that Brodie Gaslam made a similar change in convertColor to speed it up for very large sets of colors.
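The indexing idea could be sketched like this: convert each distinct color only once and map the results back to the full vector. The function name to_hex_indexed() is hypothetical, and whether this is actually faster depends on how many distinct colors the input contains.

```r
# Hypothetical helper: convert only the unique colors, then expand the
# result back to the original length and order via match().
to_hex_indexed <- function(x) {
  u <- unique(x)                                    # distinct colors only
  hex_u <- rgb(t(col2rgb(u)), maxColorValue = 255)  # one conversion each
  unname(hex_u[match(x, u)])                        # map back to input order
}

x <- rep(c("red", "blue"), times = 5e5)  # one million colors, two distinct
hex <- to_hex_indexed(x)
head(hex)
```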