r-lib / vdiffr

Visual regression testing and graphical diffing with testthat
https://vdiffr.r-lib.org

Add option to skip all tests or not have snapshots cause failures on CI? #143

Closed gavinsimpson closed 4 months ago

gavinsimpson commented 4 months ago

The README for {vdiffr} states

There may be many reasons for a snapshot to fail. Upstream changes (e.g. to the R graphics engine or to ggplot2) may cause subtle differences in your plots that are not actual failures. For this reason, snapshots do not cause failures on CRAN by default. You will only see failures locally or on CI platforms such as Github Actions.

and

The plot depends on some system library. For instance sf plots depend on libraries like GEOS and GDAL. It might not be possible to test these plots with vdiffr (which can still be used for manual inspection, add a [testthat::skip()] before the expect_doppelganger() call in that case).

I see regular (usually non-visible) differences across OSes in the plots produced by my {gratia} package, caused by OS differences in the linear algebra stack in use. As the main raison d'être of {gratia} is to provide ggplot-based plots of GAM fits from {mgcv} and related packages, it is a pain to intervene manually on each failing test by skipping on MacOS X and Windows, where the plots deviate from the references generated on my Linux box. I now have literally hundreds (ok, perhaps getting on for a hundred) of these skip_on_os() calls across my test suite, which means I'm playing a constant game of whack-a-mole with trivial non-visible failures on GH Actions when R versions change, or the runners on GH Actions change, or the system libraries on the runners change...

...and I have now switched my development platform from Linux to MacOS X, so I have to undo all those changes.

In the same way as there is an option to skip these visual checks on CRAN (or have them not cause failures), would it be possible to have an option that skips all visual tests on CI (or more specifically say on a certain CI, like GH Actions) or skips all tests on CI on a particular OS (or not have snapshots fail)?

In my use-case, I would like visual snapshots not to cause failures on GH Actions for Windows and Linux (Ubuntu), as I will do a one-time update of all snapshots on my new MacOS X machine.

If it's not desirable to add this as a feature to {vdiffr}, is there a way to detect that the tests are running on a CI service like GH Actions? I could use that check in the wrapper around expect_doppelganger() that we are now advised to use in our tests, so that it returns TRUE on some combinations of OS and GH Actions.

gavinsimpson commented 4 months ago

Sorry for the noise; I see #141 is very much related to my problem and I should be able to make the changes I need using that env var and the R example in the follow-up entry in the thread.
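
For anyone landing here with the same problem, below is a minimal sketch of the kind of wrapper asked about above. It is not the approach from #141 and does not use the env var discussed there; it assumes only that GitHub Actions sets GITHUB_ACTIONS to "true" and that Sys.info()[["sysname"]] reports "Darwin", "Linux" or "Windows". The helper names on_github_actions() and skip_visual_snapshot(), and the reference_os argument, are hypothetical and not part of {vdiffr} or {testthat}.

```r
# Hypothetical helper: TRUE when the tests run on GitHub Actions.
# Assumption: GitHub Actions sets the GITHUB_ACTIONS env var to "true".
on_github_actions <- function() {
  identical(Sys.getenv("GITHUB_ACTIONS"), "true")
}

# Hypothetical helper: decide whether visual snapshots should be skipped.
# Snapshots are skipped only on GitHub Actions runners whose OS differs
# from the OS the reference snapshots were generated on.
skip_visual_snapshot <- function(on_gha = on_github_actions(),
                                 sysname = Sys.info()[["sysname"]],
                                 reference_os = "Darwin") {
  on_gha && !identical(sysname, reference_os)
}

# Wrapper to place in a testthat helper file (e.g. tests/testthat/helper-vdiffr.R).
# It masks vdiffr::expect_doppelganger() inside the test suite only.
expect_doppelganger <- function(title, fig, ...) {
  if (skip_visual_snapshot()) {
    testthat::skip("Visual snapshots are only compared on the reference OS")
  }
  vdiffr::expect_doppelganger(title, fig, ...)
}
```

With this in place, existing expect_doppelganger() calls stay unchanged and the per-test skip_on_os() calls should no longer be needed. Local runs on any OS still compare snapshots, because the skip applies only when GITHUB_ACTIONS is "true".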