greta-dev / greta

simple and scalable statistical modelling in R
https://greta-stats.org

Address `check()` vs `test()` failure modes #501

Closed njtierney closed 2 years ago

njtierney commented 2 years ago

Should resolve #500

njtierney commented 2 years ago

These changes are a bit tricky to test, since the failures aren't replicated on GH Actions, only locally.

One clue I have so far is tests failing because of settings turned on elsewhere. What I mean is that there are tests in test_inference.R and friends:

https://github.com/greta-dev/greta/blob/master/tests/testthat/test_inference.R#L505-L541

where some part of the future plan is set, and then turned off at the end.
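The shape of that pattern is roughly the following (a sketch of the pattern, not the exact code at that link); if anything errors before the final plan() call, the parallel plan stays active for every test that runs afterwards:

library(future)
library(testthat)

test_that("something that needs a parallel plan", {
  # switch the global future plan for this test... in theory
  plan(multisession)

  expect_true(inherits(plan(), "multisession"))

  # manual reset at the end -- this line never runs if anything above errors,
  # so the multisession plan leaks into every test that follows
  plan(sequential)
})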

However, these errors then appear in places they shouldn't: when running build() and then R CMD check locally, I get errors like this:

── Error (test_greta_mcmc_list_class.R:36:3): window works ─────────────────────
Error: parallel mcmc samplers cannot be run with `plan(multiprocess)` or `plan(multicore)`
Backtrace:
    █
 1. └─greta::mcmc(m, warmup = 100, verbose = FALSE) test_greta_mcmc_list_class.R:36:2
 2.   └─greta:::run_samplers(...)
 3.     └─greta:::check_future_plan()

Which is super weird, since those errors should only happen when the plan has been tinkered with.
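One way to confirm whether a plan is leaking between tests (my own diagnostic sketch, not something in the package) is to print the active plan at the top of the failing test:

library(future)

# show the classes of the currently active plan; if an earlier test set
# plan(multisession) or plan(multicore) and never reset it, that will show
# up here instead of "sequential"
class(plan())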

It's possible that some of these tests aren't cleaning up after themselves properly, and we should be using on.exit(), or better yet withr::defer(), as described here:

https://www.tidyverse.org/blog/2020/04/self-cleaning-test-fixtures/
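Following the pattern in that post, a self-cleaning version of the fixture might look like this (a sketch, assuming we want to restore whatever plan was active before the test rather than force it back to sequential):

library(future)
library(testthat)

test_that("something that needs a parallel plan", {
  # plan() returns the previous plan when you set a new one, so stash it and
  # register the restore straight away; defer() runs it when the test_that
  # block exits, even if an expectation or the code under test errors
  old_plan <- plan(multisession)
  withr::defer(plan(old_plan))

  expect_true(inherits(plan(), "multisession"))
})

Wrapping that set-and-restore pair in a small test helper (a hypothetical local_plan(), say) would get each test back down to one line.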

njtierney commented 2 years ago

OK, so locally, running R CMD build greta and then R CMD check --as-cran greta_0.4.0.tar.gz gave me no errors.

However, devtools::check() returned errors similar to those above: 248 of them, two fewer than before, since switching to withr::defer().

This makes me wonder whether each test_that() block is failing to clean up after itself properly, so the Python environment for greta can't be loaded, or something along those lines?

njtierney commented 2 years ago

This pull request was largely resolved by discovering that the tests that mock the Python installation, e.g.,

test_that("check_tf_version errors when have_python, _tf, or _tfp is FALSE", {
    mockery::stub(check_tf_version, 'have_python', FALSE)
    mockery::stub(check_tf_version, 'have_tf', FALSE)
    mockery::stub(check_tf_version, 'have_tfp', FALSE)

    expect_snapshot_error(
      check_tf_version("error")
      )

    expect_snapshot_warning(
      check_tf_version("warn")
    )

    expect_snapshot(
      check_tf_version("message")
    )

})

These mockery::stub() calls were not behaving properly, and were actually leaking into other test environments. Strangely, this only happened when running devtools::check() locally; it did not happen with devtools::test() or on GitHub Actions CI. I have no idea why.
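For what it's worth, recent testthat versions have scoped mocking that is undone automatically when the test exits, which might sidestep this kind of leakage. Here is a sketch of the same test using testthat::local_mocked_bindings() (an alternative I'm noting here, not what this PR switched to, and it needs a reasonably new testthat):

test_that("check_tf_version errors when have_python, _tf, or _tfp is FALSE", {
  # these bindings are mocked only for the duration of this test_that block
  # and are restored automatically when it exits
  local_mocked_bindings(
    have_python = function(...) FALSE,
    have_tf = function(...) FALSE,
    have_tfp = function(...) FALSE
  )

  expect_snapshot_error(check_tf_version("error"))
  expect_snapshot_warning(check_tf_version("warn"))
  expect_snapshot(check_tf_version("message"))
})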

Along the way, I ended up doing the following things:

(commits above were extracted using the below code, then tidied up)

library(tidyverse)
library(gert)

# not sure how to get the number of commits programmatically, so hard-coding it
my_git_commit_logs <- git_log(max = 26)

# pull the commit messages oldest-first, bullet them, and copy to the clipboard
my_git_commit_logs %>%
  arrange(time) %>%
  pull(message) %>%
  paste0("* ", .) %>%
  clipr::write_clip()