Testing infrastructure

ijlyttle commented 4 years ago

Over the past few days, I have been getting my thoughts together for this project (trying to remember what I had forgotten since useR!).

I am starting to grasp what needs to be done in terms of documentation, but I think the immediate need is for a testing infrastructure.

I see three levels of tests:

1. End-to-end translation of a ggplot object to a ggspec to a Vega-Lite spec.
2. Visual confirmation that our ideas for end-to-end translation make visual sense.
3. Unit tests for subcomponents of the translations.

I plan to work on 1. and 2. in the very short-term. This will include some reference ggplot2 plots for geom_point() and geom_bar() (or maybe geom_col()), which will let Wenyu start to extend things at his end.

Item 3 is something that Haley has already implemented for "ggspec"; for "ggvega", we plan to work on it with Wenyu (as I try to get better at TS testing).

Here's my immediate plan for a PR:

[x] move inst/exploratory to dev/exploratory, and add the dev directory to .Rbuildignore.
[x] create a directory inst/examples/ggplot2 where there would be a file for each example, e.g.:
- scatterplot-iris.R containing the source code to create the ggplot object
[x] incorporate a bar-chart example

I will build some helper functions to make it easier to access the source code, or to build the objects themselves, something like:

# return the path to the example file
gg_example_file("scatterplot-iris") # for external use
gg_example_file_dev("scatterplot-iris") # for internal use 

# return a ggplot object
gg_example("scatterplot-iris")
gg_example_dev("scatterplot-iris")
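
A minimal sketch of how helpers like these might work, not the actual implementation. The `dir` argument is an assumption added so the sketch runs on its own; in the package, the external version would presumably resolve the directory with something like `system.file("examples", "ggplot2", package = "ggvega")`, and the `_dev` version would point at the source tree instead.

```r
# Hypothetical sketch: resolve an example name to its .R file under `dir`.
# `dir` is an assumption for illustration; it is not part of the proposed API.
gg_example_file <- function(name, dir) {
  path <- file.path(dir, paste0(name, ".R"))
  if (!file.exists(path)) {
    stop("example not found: ", name, call. = FALSE)
  }
  path
}

# Evaluate the example file in a fresh environment and return the value
# of its last expression (for a real example, the ggplot object).
gg_example <- function(name, dir) {
  source(gg_example_file(name, dir), local = new.env())$value
}

# Usage with a stand-in example file; a real file would end in a ggplot() call.
example_dir <- file.path(tempdir(), "examples", "ggplot2")
dir.create(example_dir, recursive = TRUE, showWarnings = FALSE)
writeLines("head(iris, 1)", file.path(example_dir, "scatterplot-iris.R"))

path <- gg_example_file("scatterplot-iris", dir = example_dir)
obj <- gg_example("scatterplot-iris", dir = example_dir)
```

Evaluating in a fresh environment keeps any intermediate variables defined by an example file out of the caller's workspace.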

Once all this is done, I plan to merge the PR so that the examples can be used right away.

In the coming days, I plan to build some testing infrastructure that will work for entire specs (as opposed to components of specs).

ijlyttle commented 4 years ago

Thinking of adjusting the example directory (of course there will be more examples than iris-scatterplot):

examples/
  full/
    ggplot2/
      iris-scatterplot.R
    ggspec/
      iris-scatterplot.gs.json
    vega-lite/
      iris-scatterplot.vl.json
  denatured/
    ggplot2/
      iris-scatterplot.R
    ggspec/
      iris-scatterplot.gs.json
    vega-lite/
      iris-scatterplot.vl.json

Here's how I see these directories being built:

full/ggplot2/: we build these by hand, writing ggplot code in R

denatured/ggplot2/: these are derived from full/ggplot2/, but we swap each dataset, x, with head(x, 1), so that we keep only the first row. This process is automated.

full/ggspec/: these are derived automatically from full/ggplot2/, using gg2spec()

full/vega-lite/: these are derived automatically from full/ggspec/, using spec2vl()

denatured/ggspec/: these are built by hand, according to how we expect gg2spec() to behave.

denatured/vega-lite/: these are built by hand, according to how we expect spec2vl() to behave.
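
One way the automated swap for denatured/ggplot2/ could be sketched is as a text rewrite of the example source. This is an assumption about the mechanism, not a description of what the project does: `denature_source()` and its `datasets` argument are hypothetical, and the sketch assumes dataset names are plain identifiers that do not also appear as column names or inside strings.

```r
# Hypothetical sketch: wrap each named dataset in head(x, 1) within
# a character vector of R source code.
denature_source <- function(code, datasets) {
  for (ds in datasets) {
    # \b restricts the match to the whole word, so "iris" does not
    # match inside a longer identifier such as "iris2"
    pattern <- paste0("\\b", ds, "\\b")
    code <- gsub(pattern, paste0("head(", ds, ", 1)"), code, perl = TRUE)
  }
  code
}

full_code <- "ggplot(iris) + geom_point(aes(x = Sepal.Width, y = Sepal.Length))"
denatured_code <- denature_source(full_code, datasets = "iris")
```

A rewrite that works on the parsed expression rather than on text would be more robust, at the cost of a longer helper.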

Here's how I see these directories being used:

The full directories would be used in our gallery, which would also serve as a visual-regression check, letting us confirm that the end results behave as we expect and giving us some insight into ggspec.

The denatured directories would be used for top-level testing, and could also be used for component testing. For example, I would set up tests that would verify that when I denature the data in the specs in full/ggspec, these are identical to the specs in denatured/ggspec. Similarly, I would set up tests that would verify that when I denature the data in the specs in full/vega-lite, these are identical to the specs in denatured/vega-lite.

Suggestions/discussion are welcome: I will work to make this framework available by Monday.

ijlyttle commented 4 years ago

I had another think about this - the overall idea is the same, but I am considering a different execution:

The directory where we compose all examples will be: data-raw/examples; this will contain the Rmd script to deploy the examples elsewhere, as well as a ggplot2 directory where we will compose all of our ggplot2 examples:

data-raw/examples/
  deploy.Rmd
  ggplot2/
    scatterplot-iris.R
    ...

There will be two locations where we deploy examples; this will be automated using the deploy.Rmd file:

inst/examples/ggplot2, which we will use as the basis of our gallery and visual-regression (validating that our translations behave as we expect)
tests/testthat/examples/ggplot2, same plots as source, but using abridged datasets (this will be automated)

tests/testthat/examples/
  ggplot2/
    scatterplot-iris.R          # using only first row of each dataset
    ...
  ggspec/
    scatterplot-iris.gs.json    # built by hand
    ...
  vega-lite/
    scatterplot-iris.vl.json    # built by hand
    ...

Our tests would verify that:

running gg2spec() on the ggplot object gives us an equivalent object to the gs.json file
running spec2vl() on the gs.json file gives us an equivalent object to the vl.json file

I will have to define "equivalent" so that it takes into account that order matters for JSON arrays [], but does not matter for JSON objects {}. In R, this means that order matters for vectors and for unnamed lists, but does not matter for named lists.
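
A minimal sketch of such an equivalence check in base R, under the assumption that both specs have already been parsed into nested lists in the same way and that keys within a JSON object are unique. The name `json_equivalent()` is hypothetical.

```r
# Hypothetical sketch: TRUE when x and y are equivalent under JSON semantics.
# Named lists (JSON objects) are compared by key, in any order;
# unnamed lists and vectors (JSON arrays) are compared by position.
json_equivalent <- function(x, y) {
  if (is.list(x) && is.list(y)) {
    if (length(x) != length(y)) return(FALSE)
    nx <- names(x)
    ny <- names(y)
    # one side named, the other not: an object cannot equal an array
    if (is.null(nx) != is.null(ny)) return(FALSE)
    if (!is.null(nx)) {
      if (!setequal(nx, ny)) return(FALSE)
      y <- y[nx]  # reorder y so its keys line up with x
    }
    same <- vapply(
      seq_along(x),
      function(i) json_equivalent(x[[i]], y[[i]]),
      logical(1)
    )
    return(all(same))
  }
  # atomic vectors: order (and type) must match exactly
  identical(x, y)
}

spec_a <- list(
  mark = "point",
  encoding = list(x = list(field = "Sepal.Width"), y = list(field = "Sepal.Length"))
)
spec_b <- list(
  encoding = list(y = list(field = "Sepal.Length"), x = list(field = "Sepal.Width")),
  mark = "point"
)

objects_match <- json_equivalent(spec_a, spec_b)           # keys reordered
arrays_match <- json_equivalent(list(1, 2), list(2, 1))    # elements reordered
```

Because `identical()` distinguishes integers from doubles, both sides need to come from the same JSON parser with the same settings for the comparison to be meaningful.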

One other note:

It will be good to verify that each of the deployed examples is represented in our visual-regression (validation) gallery
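
A check along these lines could be as small as a set difference between the deployed example names and the names the gallery knows about. In this sketch, `missing_from_gallery()` is hypothetical, as is the idea that the gallery can be summarized as a character vector of example names; the bar-chart file name is made up for illustration.

```r
# Hypothetical sketch: names of deployed examples with no gallery entry.
missing_from_gallery <- function(example_dir, gallery_names) {
  files <- list.files(example_dir, pattern = "\\.R$")
  deployed <- tools::file_path_sans_ext(files)
  setdiff(deployed, gallery_names)
}

# Usage with a temporary directory standing in for inst/examples/ggplot2
deployed_dir <- file.path(tempdir(), "deployed-examples")
dir.create(deployed_dir, showWarnings = FALSE)
file.create(file.path(deployed_dir, c("scatterplot-iris.R", "barchart-mtcars.R")))

not_in_gallery <- missing_from_gallery(deployed_dir, gallery_names = "scatterplot-iris")
```

A test could then assert that the returned vector is empty.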

tdhock commented 4 years ago

In tdhock/animint2 we use RSelenium + phantomjs on Travis for testing, which works very well in my experience.

ijlyttle commented 4 years ago

I think we want to get to something that will test rendered images automatically.

One thing I worry about is that there would be a change in the Vega renderer, or the png (or svg) renderer, that would produce a false positive. I would think we would need some way to handle this situation. {shinytest} has some tools to handle this, but I think it is reserved to Shiny.

vegawidget / ggvega

Testing infrastructure #24