clauswilke / dataviz

A book covering the fundamentals of data visualization
https://clauswilke.com/dataviz

Add subsection about memorable figure. #48

Closed clauswilke closed 6 years ago

clauswilke commented 6 years ago

In the "telling a story" chapter, it might make sense to add a brief section called "Make a memorable figure". Research has shown that embellished figures can be more memorable than plain figures (e.g., Bateman et al. 2010). I just need a good idea for an embellished figure.

clauswilke commented 6 years ago

This would also be the right place to discuss the various research literature mentioned in issue #49.

steveharoz commented 6 years ago

Some of the Otto Neurath graphs may make for useful examples. E.g. this one

I've never found a reasonably simple way to implement them with ggplot, but you might have a better understanding of R's graphics.

clauswilke commented 6 years ago

I've been thinking along those lines today, e.g. visualizing the msleep dataset using the animal silhouettes available here: http://phylopic.org/

Just not sure how much effort I want to put towards writing a geom that can do this. It's likely at least a week of work.

clauswilke commented 6 years ago

To implement this, we'd have to take geom_col() and replace the drawing subroutine with a function that covers the area of each bar with tiles of images. I see no technical hurdle to doing this. Just requires a bit of calculation to get this right without distorting the aspect ratio of the images and to make it responsive if the image is resized.

steveharoz commented 6 years ago

I wonder if you could modify geom_dotplot to allow setting the shape by category. Here's a dumb implementation with geom_point:

library(ggplot2)
library(tibble)

data <- tibble(value = c(1:5, 1:3), category = c(rep('a', 5), rep('b', 3)))

ggplot(data) +
  aes(x=value, y=category, color=category, fill=category, shape=category) +
  geom_point(size=40) +
  scale_shape_manual(values=c("\u26C4", "\u26C5")) +
  expand_limits(x=c(0.5, 5.5)) +
  theme_classic()

image

Unfortunately, this approach is limited to symbols that exist in fonts, which, even with packages like Font Awesome, is not very extensible.

clauswilke commented 6 years ago

I think I'll just write a grob that can tile an arbitrary rectangular area with an image. This will also address the longstanding request of being able to draw bars with textured fill. https://stackoverflow.com/questions/2895319/how-to-add-texture-to-fill-colors-in-ggplot2

clauswilke commented 6 years ago

Something like this is going to do the job. The rest is just filling in the blanks and adding a nice API.

library(grid)
library(magick)
#> Linking to ImageMagick 6.9.9.39
#> Enabled features: cairo, fontconfig, freetype, lcms, pango, rsvg, webp
#> Disabled features: fftw, ghostscript, x11

get_asp <- function(img) {
  info <- image_info(img)
  info$height / info$width
}

texture_grob <- function(x = unit(0.5, "npc"), y = unit(0.5, "npc"),
                         width = unit(1, "npc"), height = unit(1, "npc"),
                         img = NULL) {
  vp <- viewport(x, y, width, height, just = c(0, 0), clip = "on")
  g <- gTree(img = img, cl = "texture_grob", vp = vp)
}

makeContent.texture_grob <- function(x) {
  grob_width <- convertWidth(unit(1, "npc"), "in", valueOnly = TRUE)
  grob_height <- convertHeight(unit(1, "npc"), "in", valueOnly = TRUE)
  grob_asp <- grob_height / grob_width
  asp <- get_asp(x$img)
  n <- ceiling(grob_asp/asp) # number of image copies we need

  bg <- rectGrob(gp = gpar(fill = "#E8E8E8"))
  children <- lapply(
    1:n,
    function(i) {
      vp = viewport(
        x = unit(0, "in"), y = unit(i*asp*grob_width, "in"),
        width = unit(grob_width, "in"), height = unit(asp*grob_width, "in"),
        just = c(0, 1)
      )
      rasterGrob(x$img, vp = vp)
    }
  )
  # add background to children
  children <- c(list(bg), children)

  # convert to gList and set
  children <- do.call(gList, children)
  setChildren(x, children)
}

img <- magick::image_read("https://jeroen.github.io/images/Rlogo.png")

grid.newpage()
tg1 <- texture_grob(unit(.2, "npc"), unit(.05, "npc"), unit(.1, "npc"), unit(.9, "npc"), img)
tg2 <- texture_grob(unit(.5, "npc"), unit(.05, "npc"), unit(.3, "npc"), unit(.6, "npc"), img)
grid.draw(tg1)
grid.draw(tg2)

Created on 2018-09-23 by the reprex package (v0.2.0).

steveharoz commented 6 years ago

Nice! But because it's in npc units, changing the size or aspect ratio of the plot also changes the number of repetitions of the tiled image.

Running the same code with a different image size: image
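
To make the dependence concrete, here is a minimal sketch of the arithmetic behind `n <- ceiling(grob_asp/asp)` in the code above. The helper `tiles_needed()` and the device sizes are hypothetical and chosen only for illustration. It assumes a square image (aspect ratio 1) and a bar that spans a quarter of the device width and three quarters of its height, so the numbers stay exact.

```r
# Hypothetical helper mirroring n <- ceiling(grob_asp / asp) from the example above.
tiles_needed <- function(dev_width, dev_height, frac_w, frac_h, img_asp) {
  grob_width  <- frac_w * dev_width    # bar width in inches
  grob_height <- frac_h * dev_height   # bar height in inches
  ceiling((grob_height / grob_width) / img_asp)
}

# Same npc fractions, two device shapes (assumed sizes in inches).
n_square <- tiles_needed(8, 8, 0.25, 0.75, img_asp = 1)   # 8 x 8 device
n_wide   <- tiles_needed(16, 8, 0.25, 0.75, img_asp = 1)  # 16 x 8 device
c(square = n_square, wide = n_wide)
```

With these inputs the square device needs three copies and the wide one only two, even though the bar's npc coordinates are identical.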

Maybe render each unit as a rectangle with the image's aspect ratio. Use the same texture approach only to allow for partial images. Separating each unit has the added bonus of allowing you to adjust spacing.

I modified your example to account for the image aspect ratio, but it still doesn't compensate for the plot's aspect ratio.

library(grid)
library(magick)
#> Linking to ImageMagick 6.9.9.39
#> Enabled features: cairo, fontconfig, freetype, lcms, pango, rsvg, webp
#> Disabled features: fftw, ghostscript, x11

get_asp <- function(img) {
  info <- image_info(img)
  info$height / info$width
}

texture_grob <- function(x = unit(0.5, "npc"), y = unit(0.5, "npc"),
                         width = unit(1, "npc"), height = unit(1, "npc"),
                         img = NULL) {
  vp <- viewport(x, y, width, height, just = c(0, 0), clip = "on")
  g <- gTree(img = img, cl = "texture_grob", vp = vp)
}

makeContent.texture_grob <- function(x) {
  grob_width <- convertWidth(unit(1, "npc"), "in", valueOnly = TRUE)
  grob_height <- convertHeight(unit(1, "npc"), "in", valueOnly = TRUE)
  grob_asp <- grob_height / grob_width
  asp <- get_asp(x$img)

  # background square is useful for debugging
  bg <- rectGrob(gp = gpar(fill = "#E8E8E8", lty = "blank"))
  # draw one instance of the image
  vp <- viewport(
    x = unit(0, "in"), y = unit(asp*grob_width, "in"),
    width = unit(grob_width, "in"), height = unit(asp*grob_width, "in"),
    just = c(0, 1)
  )
  children <- list(rasterGrob(x$img, vp = vp))
  # add background to children
  children <- c(list(bg), children)

  # convert to gList and set
  children <- do.call(gList, children)
  setChildren(x, children)
}

draw_vertical_stack_of_images <- function (
    x = unit(0.5, "npc"), 
    y = unit(0.5, "npc"),
    width = unit(.1, "npc"), 
    img = NULL,
    unit_count = 1,
    max_unit_count = 10,
    spacing = 0.1) {

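  aspect_ratio <- get_asp(img)
  # Note: this overwrites the `width` argument. Each cell's width is derived
  # from the image aspect ratio, so a cell matches the image's proportions
  # only on a square plot (and before spacing is applied).
  width <- 1 / aspect_ratio / max_unit_count
  height <- 1 / max_unit_count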

  for (i in 1:ceiling(unit_count)) {
    unit_y <- y + (i - 1) * height
    unit_height <- height * (1 - spacing) * min(1, unit_count - i + 1)
    tg <- texture_grob(
      unit(x, "npc"), 
      unit(unit_y, "npc"), 
      unit(width, "npc"), 
      unit(unit_height, "npc"), 
      img)
    grid.draw(tg)
  }
}

img1 <- magick::image_read("http://steveharoz.com/research/isotype/icons/giraffe.svg")
img2 <- magick::image_read("http://steveharoz.com/research/isotype/icons/elephant.svg")

grid.newpage()
draw_vertical_stack_of_images(x=.2, y=.05, width=.1, img=img1, unit_count=5.5, max_unit_count=10)
draw_vertical_stack_of_images(x=.5, y=.05, width=.1, img=img2, unit_count=2.2, max_unit_count=10)

image

clauswilke commented 6 years ago

I think dependency on the plot aspect ratio is unavoidable. Similar issues arise with other packages, e.g. ggrepel. The label location changes every time you change the plot size.

For the problem of tiling images with different aspect ratios, my perspective is that the underlying drawing infrastructure should do general-purpose tiling. Then, any desired effect can be obtained with that infrastructure by providing appropriately formatted and/or padded images. The padding could even be done in R with the magick package.
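
As a rough sketch of what such padding might look like, the following enlarges an image's canvas with transparent pixels until it reaches a target aspect ratio. The helper name `pad_to_asp()` and the blank placeholder image are assumptions made for illustration; only `image_info()`, `image_extent()`, and `image_blank()` come from magick.

```r
library(magick)

# Hypothetical helper: pad an image to a target aspect ratio (height / width)
# by enlarging the canvas with transparent pixels; the image is not rescaled.
pad_to_asp <- function(img, target_asp = 1) {
  info <- image_info(img)
  w <- info$width
  h <- info$height
  if (h / w < target_asp) {
    h <- ceiling(w * target_asp)   # image too wide: add rows
  } else {
    w <- ceiling(h / target_asp)   # image too tall (or already matching): add columns
  }
  geometry <- sprintf("%dx%d", as.integer(w), as.integer(h))
  image_extent(img, geometry = geometry, gravity = "center", color = "none")
}

# Placeholder 200 x 100 image standing in for an icon.
icon <- image_blank(width = 200, height = 100, color = "steelblue")
padded <- pad_to_asp(icon, target_asp = 1)
image_info(padded)[, c("width", "height")]
```

Padding every icon to the same aspect ratio this way would let a general-purpose tiler treat all images identically.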

steveharoz commented 6 years ago

Do you know how to get the plot aspect ratio? Is there a get_panel_size_in_pixels() function in grid? If so, you could compensate for the image aspect ratio. The reason why I bring it up is that in most unit charts, a unit has a specific scale, like 1 elephant icon = 10,000 elephants. If the number of items changed with aspect ratio, it could become misleading.
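
For reference, a minimal sketch of how the current viewport's size might be queried, using the same `convertWidth()` / `convertHeight()` calls as the examples above. The helper `viewport_size_in()` is hypothetical, and the off-screen 6 x 3 inch PDF device is just an assumption to make the example reproducible. Inside a ggplot2 panel, this would presumably have to run at draw time (e.g. in `makeContent()`), since the panel size is not known earlier.

```r
library(grid)

# Hypothetical helper: width, height (inches) and aspect ratio of the
# viewport that is current when the function is called.
viewport_size_in <- function() {
  w <- convertWidth(unit(1, "npc"), "in", valueOnly = TRUE)
  h <- convertHeight(unit(1, "npc"), "in", valueOnly = TRUE)
  c(width = w, height = h, asp = h / w)
}

pdf(NULL, width = 6, height = 3)  # off-screen device, no file is written
grid.newpage()
size <- viewport_size_in()
invisible(dev.off())
size
```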

clauswilke commented 6 years ago

I think the right way to solve this problem is to make it possible to use the tiling images at a specified height or width and then ideally tie that to the scale in ggplot, so that e.g. one image height corresponds to x units on the scale regardless of the plot size or aspect ratio. I think that's possible, but I won't be able to look into this further until later this fall.
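
The idea can be sketched as plain arithmetic: if one image stands for a fixed number of data units, the number of copies depends only on the value, not on the plot size. The helper `icons_needed()` and the numbers below are hypothetical, and this is not a claim about how any package implements it.

```r
# Hypothetical helper: number of full icons and the fraction of a final,
# partially drawn icon needed to represent `value`.
icons_needed <- function(value, units_per_icon) {
  n <- value / units_per_icon
  list(full = floor(n), partial = n - floor(n))
}

# Example: 1 icon = 10,000 animals, bar value = 55,000.
res <- icons_needed(55000, units_per_icon = 10000)
str(res)
```

Here the bar would need five full icons plus half of a sixth, whatever the aspect ratio of the plot.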

clauswilke commented 6 years ago

I've got it figured out: https://github.com/clauswilke/ggtextures (Legends aren't implemented yet.)

library(ggplot2)
library(tibble)
library(ggtextures)

data <- tibble(
  count = c(5, 3, 6),
  animal = c("giraffe", "elephant", "horse"),
  image = list(
    "http://steveharoz.com/research/isotype/icons/giraffe.svg",
    "http://steveharoz.com/research/isotype/icons/elephant.svg",
    "http://steveharoz.com/research/isotype/icons/horse.svg"
  )
)

ggplot(data, aes(animal, count, image = image)) +
  geom_isotype_col() +
  theme_minimal()


ggplot(data, aes(animal, count, image = image)) +
  geom_isotype_col(
    img_width = grid::unit(1, "native"), img_height = NULL,
    ncol = NA, nrow = 1,
    hjust = 0, vjust = 0.5
  ) +
  coord_flip() +
  theme_minimal()

Created on 2018-09-27 by the reprex package (v0.2.0).