r-lib / testthat

An R 📦 to make testing 😀
https://testthat.r-lib.org

Support compressed svg in snapshot tests #1732

Open gavinsimpson opened 1 year ago

gavinsimpson commented 1 year ago

In relation to r-lib/vdiffr#126, could testthat support compressed SVGs produced by {svglite}? It seems that it should be pretty simple to decompress the SVG in testthat::compare_file_text using gzfile() or gzcon(), so {brio} doesn't need to support gz compressed files natively. I haven't looked at what might be required in the shiny app to review snapshots.
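As a rough illustration of the two base R options mentioned above, the sketch below writes a tiny SVG-like file through a gzip connection and reads it back with both gzfile() and gzcon(). The file contents and variable names are made up for the example; nothing here is testthat or brio code.

```r
# Write a small SVG-like text file through a gzip connection.
svg_lines <- c(
  "<svg xmlns='http://www.w3.org/2000/svg' width='10' height='10'>",
  "  <circle cx='5' cy='5' r='4'/>",
  "</svg>"
)
path <- tempfile(fileext = ".svgz")
con <- gzfile(path, open = "wt")
writeLines(svg_lines, con)
close(con)

# Option 1: gzfile() decompresses while reading from a file path.
con <- gzfile(path)
via_gzfile <- readLines(con)
close(con)

# Option 2: gzcon() wraps an existing binary connection and decompresses it.
con <- gzcon(file(path, open = "rb"))
via_gzcon <- readLines(con)
close(con)

# Both routes should give back the original lines.
identical(via_gzfile, svg_lines)
identical(via_gzcon, svg_lines)
```

gzfile() is the shorter route when the snapshot is a file on disk; gzcon() would only be needed if the data arrived through some other kind of connection.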

The main use case for this is massively reducing the file size of SVG snapshots produced by {vdiffr}. Large snapshots inflate the tarball produced by R CMD build, and its size is being flagged by CRAN's incoming checks.

Why do we have such large SVGs? Checks on images produced from statistical models seem very sensitive to small differences in the fitted model caused by OS differences etc., and those differences go away with larger data. But plotting outputs from these models often involves plotting the data too, which inflates the size of the resulting plots. Regardless, reducing the file size of SVG snapshots would be good in terms of CRAN's policy on retention of package sources.

If you agree this is useful, would it be best to limit such a change to supporting only compressed svg or allowing tests of compressed text files more generally? Only gz compression or others? I'd happily produce a PR implementing the above if there is interest.
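On the question of gz only versus other formats, one relevant detail from base R's connections documentation is that gzfile(), when reading, also opens uncompressed files and files compressed with bzip2 or xz. The sketch below checks that on temporary files; write_with() and read_any() are hypothetical helpers and the file extensions are only illustrative.

```r
lines_in <- c("<svg>", "  <rect width='1' height='1'/>", "</svg>")

# Hypothetical helper: write lines_in through the given connection constructor.
write_with <- function(make_con, ext) {
  path <- tempfile(fileext = ext)
  con <- make_con(path, open = "wt")
  on.exit(close(con))
  writeLines(lines_in, con)
  path
}

paths <- c(
  plain = write_with(file, ".svg"),
  gzip  = write_with(gzfile, ".svgz"),
  bzip2 = write_with(bzfile, ".svg.bz2"),
  xz    = write_with(xzfile, ".svg.xz")
)

# Hypothetical reader: a single gzfile() call for every format.
read_any <- function(path) {
  con <- gzfile(path)
  on.exit(close(con))
  readLines(con)
}

# One logical per format: did the round trip preserve the text?
results <- vapply(
  paths,
  function(p) identical(read_any(p), lines_in),
  logical(1)
)
results
```

If that behaviour is acceptable, a reader built on gzfile() would not need to dispatch on the file extension at all, although an explicit extension check makes the intent easier to see.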

hadley commented 1 year ago

I think the place to start would be to prototype a custom compare function that you can pass to expect_snapshot_file(). If that's sufficient, we can figure out how to include it in vdiffr. If it's not enough, it'll suggest what other changes are needed in testthat.

gavinsimpson commented 3 months ago

Returning to this, here is a pair of functions implementing the compare function @hadley suggested I start with:

library("svglite")

# functions for testthat compare
read_svg <- function(f) {
  # .svgz files are gzip-compressed, so read them through a gzfile() connection
  if (identical(tools::file_ext(f), "svgz")) {
    fc <- gzfile(f)
    on.exit(close(fc))
    base::readLines(fc)
  } else {
    base::readLines(f)
  }
}
compare_file_text_svg <- function(old, new) {
  old <- read_svg(old)
  new <- read_svg(new)
  identical(old, new)
}

# helper to do a plot and store it in temp file as a svg
save_svg <- function(code, compress = FALSE, width = 10, height = 8, ...) {
  # svglite writes a compressed file when the extension is .svgz
  extn <- if (compress) ".svgz" else ".svg"
  path <- tempfile(fileext = extn)
  svglite::svglite(path, width = width, height = height, ...)
  on.exit(dev.off())
  code # evaluate the plotting code while the device is open

  path
}

# generate some plots
p1 <- save_svg(plot(1:10), compress = TRUE)
p2 <- save_svg(plot(1:5), compress = TRUE)
p3 <- save_svg(plot(1:10), compress = FALSE)

# compare them
compare_file_text_svg(p1, p1) # want this to be TRUE
#> [1] TRUE
compare_file_text_svg(p1, p2) # this should be FALSE
#> [1] FALSE
compare_file_text_svg(p1, p3) # want this to be TRUE
#> [1] TRUE

Created on 2024-03-27 with reprex v2.1.0
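For completeness, here is a minimal sketch of how a compare function of this shape might be handed to expect_snapshot_file() through its compare argument. The names compare_svg_lines() and expect_snapshot_svg() are hypothetical, and this version leans on gzfile() reading plain files as well as gzip-compressed ones rather than checking the extension as the functions above do. The wrapper is only defined here, not called, because it needs a running testthat snapshot context.

```r
# Hypothetical compare function with the shape expect_snapshot_file() expects:
# two file paths in, a single TRUE or FALSE out.
compare_svg_lines <- function(old, new) {
  read_lines_gz <- function(path) {
    con <- gzfile(path) # reads gzip-compressed and plain files alike
    on.exit(close(con))
    readLines(con)
  }
  identical(read_lines_gz(old), read_lines_gz(new))
}

# Hypothetical wrapper intended for use inside a test_that() block.
expect_snapshot_svg <- function(path, name = basename(path)) {
  testthat::expect_snapshot_file(path, name, compare = compare_svg_lines)
}

# Quick check of the compare function outside testthat:
# the same text stored once as plain SVG and once gzip-compressed.
svg <- c("<svg>", "</svg>")
plain <- tempfile(fileext = ".svg")
writeLines(svg, plain)
zipped <- tempfile(fileext = ".svgz")
con <- gzfile(zipped, open = "wt")
writeLines(svg, con)
close(con)

compare_svg_lines(plain, zipped)
```

In a test file the wrapper would be called as something like expect_snapshot_svg(save_svg(plot(1:10), compress = TRUE), "plot.svgz"), assuming the save_svg() helper from the reprex above is available.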

brio::read_lines() doesn't handle connections so I switched to using base::readLines() for the compressed version. I'm not sure if this matters, or whether it is safer to use