greta-dev / greta

simple and scalable statistical modelling in R
https://greta-stats.org

Address `check()` vs `test()` failure modes #501

Closed njtierney closed 2 years ago

njtierney commented 2 years ago

Should resolve #500

njtierney commented 2 years ago

These changes are a bit tricky to test, since the failures aren't replicated on GH Actions, only locally.

One clue I have so far is tests failing because of settings turned on elsewhere. What I mean is that there are tests in test_inference.R and friends:

https://github.com/greta-dev/greta/blob/master/tests/testthat/test_inference.R#L505-L541

where some part of the future plan is set, and then turned off at the end.
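The shape of that pattern is roughly the following (a sketch of the pattern, not the exact code at that link); if anything errors before the final plan() call, the parallel plan stays active for every test that runs afterwards:

library(future)
library(testthat)

test_that("something that needs a parallel plan", {
  # switch the global future plan for this test... in theory
  plan(multisession)

  expect_true(inherits(plan(), "multisession"))

  # manual reset at the end -- this line never runs if anything above errors,
  # so the multisession plan leaks into every test that follows
  plan(sequential)
})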

However, these errors then appear in places they shouldn't: when running build() and then R CMD check locally, I get errors like this:

── Error (test_greta_mcmc_list_class.R:36:3): window works ─────────────────────
Error: parallel mcmc samplers cannot be run with `plan(multiprocess)` or `plan(multicore)`
Backtrace:
    █
 1. └─greta::mcmc(m, warmup = 100, verbose = FALSE) test_greta_mcmc_list_class.R:36:2
 2.   └─greta:::run_samplers(...)
 3.     └─greta:::check_future_plan()

Which is super weird, since those errors should only happen when the plan has been tinkered with.
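One way to confirm whether a plan is leaking between tests (my own diagnostic sketch, not something in the package) is to print the active plan at the top of the failing test:

library(future)

# show the classes of the currently active plan; if an earlier test set
# plan(multisession) or plan(multicore) and never reset it, that will show
# up here instead of "sequential"
class(plan())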

It's possible that some of these tests aren't cleaning up after themselves properly, and we should be using on.exit(), or better yet withr::defer(), as described here:

https://www.tidyverse.org/blog/2020/04/self-cleaning-test-fixtures/
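Following the pattern in that post, a self-cleaning version of the fixture might look like this (a sketch, assuming we want to restore whatever plan was active before the test rather than force it back to sequential):

library(future)
library(testthat)

test_that("something that needs a parallel plan", {
  # plan() returns the previous plan when you set a new one, so stash it and
  # register the restore straight away; defer() runs it when the test_that
  # block exits, even if an expectation or the code under test errors
  old_plan <- plan(multisession)
  withr::defer(plan(old_plan))

  expect_true(inherits(plan(), "multisession"))
})

Wrapping that set-and-restore pair in a small test helper (a hypothetical local_plan(), say) would get each test back down to one line.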

njtierney commented 2 years ago

OK, so locally, running R CMD build greta and then R CMD check --as-cran greta_0.4.0.tar.gz gave me no errors.

However, devtools::check() returned errors similar to those above: 248 of them, two fewer than before, since switching to withr::defer().

This makes me wonder whether each test_that() block is failing to clean up after itself properly, so the Python environment for greta can't be loaded, or something along those lines?

njtierney commented 2 years ago

This pull request was largely resolved by discovering that the tests that mock the Python installation, e.g.,

test_that("check_tf_version errors when have_python, _tf, or _tfp is FALSE", {
    mockery::stub(check_tf_version, 'have_python', FALSE)
    mockery::stub(check_tf_version, 'have_tf', FALSE)
    mockery::stub(check_tf_version, 'have_tfp', FALSE)

    expect_snapshot_error(
      check_tf_version("error")
      )

    expect_snapshot_warning(
      check_tf_version("warn")
    )

    expect_snapshot(
      check_tf_version("message")
    )

})

These mockery::stub() calls were not behaving properly, and were actually leaking into other test environments. Strangely, this only happened when running devtools::check() locally; it did not happen with devtools::test() or on GitHub Actions CI. I have no idea why.
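For what it's worth, recent testthat versions have scoped mocking that is undone automatically when the test exits, which might sidestep this kind of leakage. Here is a sketch of the same test using testthat::local_mocked_bindings() (an alternative I'm noting here, not what this PR switched to, and it needs a reasonably new testthat):

test_that("check_tf_version errors when have_python, _tf, or _tfp is FALSE", {
  # these bindings are mocked only for the duration of this test_that block
  # and are restored automatically when it exits
  local_mocked_bindings(
    have_python = function(...) FALSE,
    have_tf = function(...) FALSE,
    have_tfp = function(...) FALSE
  )

  expect_snapshot_error(check_tf_version("error"))
  expect_snapshot_warning(check_tf_version("warn"))
  expect_snapshot(check_tf_version("message"))
})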

Along the way, I ended up doing the following things:

(commits above were extracted using the below code, then tidied up)

library(tidyverse)
library(gert)

# not sure how to get the number of commits programmatically, so hard-coding it
my_git_commit_logs <- git_log(max = 26)

# pull the commit messages oldest-first, bullet them, and copy to the clipboard
my_git_commit_logs %>%
  arrange(time) %>%
  pull(message) %>%
  paste0("* ", .) %>%
  clipr::write_clip()