haozhu233 / kableExtra

Construct Complex Table with knitr::kable() + pipe.
https://haozhu233.github.io/kableExtra/

General spec_plotfun function #503

Open haozhu233 opened 4 years ago

haozhu233 commented 4 years ago

@r2evans made a great suggestion that we should create a general spec_plot function that allows users to feed in customized plotting functions. Here is the issue to track the progress and manage the discussions on this topic.

r2evans commented 4 years ago

Sorry, I probably made the PR a bit more detailed than it needed to be. I'll copy the examples from there so that we can discuss it here.

Using the spec_plot as defined in #505 and this code:

library(kableExtra)  # kbl(), kable_paper(), column_spec(), spec_*()

# spec_plot() here is the version proposed in #505; small_lines() and
# small_lines_gg() are not defined in this snippet.
mpg_list <- split(mtcars$mpg, mtcars$cyl)
inline_plot <- data.frame(cyl = c(4, 6, 8), mpg_box = "", mpg_hist = "",
                          spec_line = "", small_lines = "", small_lines_gg = "")
inline_plot %>% 
  kbl(booktabs = TRUE) %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(2, image = spec_boxplot(mpg_list)) %>%
  column_spec(3, image = spec_hist(mpg_list)) %>%
  column_spec(4, image = spec_line(mpg_list, same_lim = FALSE)) %>%
  column_spec(5, image = spec_plot(small_lines(), mpg_list)) %>%
  column_spec(6, image = spec_plot(small_lines_gg(), mpg_list))

we can generate

image

image

and take it a bit further with a more complicated example:

set.seed(42)
x <- cumsum(rnorm(20))

somefunc <- function(z) {
  dat <- data.frame(
    x = seq_along(z),
    y = z,
    mu = zoo::rollapply(z, 9, mean, partial = TRUE),
    sigma = 12*zoo::rollapply(z, 9, sd, partial = TRUE)
  )
  graphics::par(mar = c(0, 0, 0.2, 0), lwd=0.5)
  plot(NA, type = "n", xaxt = "n", yaxt = "n", ann = FALSE, frame.plot = FALSE,
       xlim = c(1, length(z)), ylim = range(z) + c(-1,1)*max(dat$sigma))
  polygon(c(dat$x, rev(dat$x), dat$x[1]),
          c(dat$mu + dat$sigma, rev(dat$mu - dat$sigma), dat$mu[1] + dat$sigma[1]),
          col = "gray80", lwd = 0.1)
  lines(mu~x, data=dat, col = "red", lwd = 0.2)
}

data.frame(a="quux", b="") %>%
  kbl(booktabs = TRUE) %>%
  kable_paper(full_width = FALSE) %>%
  column_spec(2, image = spec_plot(somefunc, list(x)))

image

r2evans commented 4 years ago

Some concerns I have, looking for your comments:

  1. The technique that lets spec_plot handle both base graphics and ggplot2 graphics is this: if the user-defined function (UDF) returns an object that inherits from "ggplot", we discard the image already saved through grDevices::svg (or png), because ggplot2 performs inconsistently (at least with svg). To get it to work, I get the ggplot2::ggsave function with

    ggsave <- get("ggsave", envir=as.environment("package:ggplot2"))

    so that we don't necessarily trigger a CRAN complaint about ggplot2 not being listed in Imports or Depends. The reason I don't go so far as to call requireNamespace("ggplot2") is that if we are collecting a graphic of class "ggplot", I find it highly unlikely that we would need to try hard to load the namespace and fail if it is not found. While I can contrive an example where a system returns an object of class "ggplot" without the ggplot2 package being available, it is ... obscure.

  2. I added ggplot2 to Suggests:, since this package will be able to take advantage of it if present. Your other option is to add ggplot2 to Imports: and we can access ggsave with the standard double-colon notation without CRAN complaining ... but I was not assuming that you wanted to add that otherwise-optional package as an installation prerequisite.

  3. I knowingly accept the inefficiency of calling svg(); ...; dev.off() and then overwriting with ggsave, since we don't know until the plot function is complete whether we can infer it is ggplot-like. Since this is not likely to surprise the user, we could certainly require spec_plot(func, mpg_list, isggplot=TRUE) or similar, and remove the double-tap on ::svg(). I don't think it's a problem, frankly, other than a little extra processing.

  4. I think I am handling ggsave calls correctly. I am not proficient with units= and the other arguments it infers or requires, so I found a combination that works. Oddly, I found I must use device=grDevices::svg and cannot use device=grDevices::png, which fails with width/height problems. There might be a better way to handle this in the if (inherits(thisplot, "ggplot")) block.

  5. I added the use of on.exit(dev.off(curdev)) (also in spec_line) because during testing, I often interrupted things and had a stale graphics device sitting around. Calling dev.off(curdev) multiple times (on the same dev) is harmless and repeats are a no-op. For consistency, it might be good to add this to spec_boxplot and spec_hist as well.
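
Points 1, 3 and 5 above can be illustrated with a minimal sketch. This is not the code from #505: render_udf and its arguments are hypothetical names invented for the illustration, and the device is assumed to take its width and height in inches (as svg and pdf do). One deliberate difference from point 1: as.environment("package:ggplot2") only resolves when ggplot2 has been attached with library(), so the sketch looks ggsave up in the namespace with getExportedValue() instead, which only needs the package to be installed.

```r
# Hypothetical helper, not part of kableExtra: render a user-defined plotting
# function (UDF) to a file, for either base graphics or a returned ggplot.
render_udf <- function(fun, x, file, width = 2, height = 0.5,
                       device = grDevices::svg) {
  device(file, width = width, height = height)
  curdev <- grDevices::dev.cur()
  # Point 5: close our device even if fun() errors or is interrupted.
  # The guard skips the call when the device has already been closed.
  on.exit(
    if (curdev %in% grDevices::dev.list()) grDevices::dev.off(curdev),
    add = TRUE
  )

  res <- fun(x)  # base graphics draw on the open device as a side effect

  if (inherits(res, "ggplot")) {
    # Point 3: the first render is thrown away and the file is overwritten.
    grDevices::dev.off(curdev)
    if (!requireNamespace("ggplot2", quietly = TRUE)) {
      stop("the UDF returned a ggplot object, but ggplot2 is not installed")
    }
    # Variant of point 1: namespace lookup, so ggplot2 can stay in Suggests.
    ggsave <- getExportedValue("ggplot2", "ggsave")
    ggsave(file, plot = res, device = device,
           width = width, height = height, units = "in")
  }
  invisible(file)
}
```

With the somefunc and x from the earlier comment, render_udf(somefunc, x, tempfile(fileext = ".svg")) would write the sparkline to a temporary SVG file. At this size a base-graphics UDF has to shrink its margins with par(mar = ...), as somefunc does, or plot() stops with a "figure margins too large" error.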

And some discussion points on a separate vignette on UDFs:

harrelfe commented 1 year ago

I tried running the earlier somefunc code (but using pipe |>) and got Error in min(x, na.rm = na.rm) : invalid 'type' (list) of argument. Is this general "somefunc" approach implemented?

r2evans commented 1 year ago

@harrelfe I don't think so, unfortunately. I think all (most, at least) of my use-cases are currently covered by the other spec_* functions, so I slowed development on it and moved on. What's your use-case where you'd prefer to use a UDF for plotting?