Error when using parallel::parLapply with devtools::test

agranholm commented 1 year ago

Hello. I am having trouble with using devtools::test() in combination with the parallel-package; it seems to be a namespace issue, as the error seems caused by the previous version of a function being loaded on the parallel-clusters instead of the development version otherwise loaded by devtools when testing without parallelization. I've found an old Stackoverflow post discussing this: https://stackoverflow.com/questions/25605819/how-to-use-the-parallel-package-inside-another-package-using-devtools However, it is >8 years old and the solution does not seem to work anymore, so any help would be appreciated. I have not been able to find a previous issue that deals with the exact same problem with a working solution. Given that load_all is also used in the example, let me know if this is a better fit for the pkgload (or testthat) repo or somewhere else.

In brief, we have an R package on CRAN, and are updating some of the functions. One updated function differs somewhat in the output returned than in the previous version. This function is called multiple times by another function, with the option to do this in parallel using the parallel library. This causes an error with devtools::test, and - as stated above - this seems to be due to the previous installed version loaded when using parallel::parLapply instead of lapply. Here's how to reproduce the problem, simplified as much as possible. Would be happy to provide additional details. Any help on how to solve this would be appreciated.

1) Create a new package. Add the following code to an R file:

int_m2 <- function(x) base::mean(x) # * 2 # Commented out for now

m2 <- function(x) int_m2(x)

This is one user-facing function (m2) calling another function that is supposed to be internal (int_m2).

2) Install using devtools::install() and restart R session afterwards.

3) Update the previous R functions to this (now both m2 and int_m2 will return values multiplied by 2, compared to before):

int_m2 <- function(x) base::mean(x) * 2

m2 <- function(x) int_m2(x)

4) Create the following test file using devtools::use_test:

test_that("it works", {
  cl <- parallel::makeCluster(2)
  on.exit(parallel::stopCluster(cl), add = TRUE, after = FALSE)

  x <- list(1:10, 2:20, 3:30)

  expect_equal(tolerance = 0.00001,
               lapply(x, FUN = m2),
               parallel::parLapply(X = x, fun = m2, cl = cl)
               )
})

5) Load devtools, call load_all() and manually execute the code within the test_that call above, i.e.:

library(devtools)
load_all()

  cl <- parallel::makeCluster(2)
  on.exit(parallel::stopCluster(cl), add = TRUE, after = FALSE)

  x <- list(1:10, 2:20, 3:30)

  expect_equal(tolerance = 0.00001,
               lapply(x, FUN = m2),
               parallel::parLapply(X = x, fun = m2, cl = cl)
               )

This gives an error. Running devtools::test() similarly gives an error, as outputs returned by parallel::parLapply are not multiplied by two.

6) The suggestion in the Stackoverflow post above, i.e., adding parallel::clusterEvalQ(cl, devtools::dev_mode()) before testing, does not seem to solve it. Restart R session and run:

library(devtools)
load_all()

  cl <- parallel::makeCluster(2)
  on.exit(parallel::stopCluster(cl), add = TRUE, after = FALSE)
  parallel::clusterEvalQ(cl, devtools::dev_mode())

  x <- list(1:10, 2:20, 3:30)

  expect_equal(tolerance = 0.00001,
               lapply(x, FUN = m2),
               parallel::parLapply(X = x, fun = m2, cl = cl)
               )

7) I've also tried adding parallel::clusterExport(cl, "int_m2"), but this does not seem to solve it if the package is installed. If, however, I remove the previously installed package, it works when I run this interactively:

library(devtools)
load_all()

  cl <- parallel::makeCluster(2)
  on.exit(parallel::stopCluster(cl), add = TRUE, after = FALSE)
  parallel::clusterExport(cl, "int_m2")

  x <- list(1:10, 2:20, 3:30)

  expect_equal(tolerance = 0.00001,
               lapply(x, FUN = m2),
               parallel::parLapply(X = x, fun = m2, cl = cl)
               )

Similarly, it works if I restart the session and run:

library(devtools)
load_all()
test()

However, if I repeat steps 1+2 and re-install and re-run the test code in step 7, this solution does not work either.

Any help with a solution to this issue would be much appreciated, as would guidance on where else this issue would be suited if not fitting here.

jennybc commented 1 year ago

I think you already basically get this, but the overall problem seems to be that it's hard or impossible to get parallel::parLapply() to use the value of m2() implied by the current source of your package, as opposed to using the officially installed version.

I note that devtools::check() does pass, because it really does install your package, thereby harmonizing everyone's notion of what m2() does.

I have similar challenges with reprex, because the reprex_document() output format used in tests always comes from the genuinely installed version, not from current source, because it's getting used by another package (rmarkdown). So, when working on certain areas of reprex, I really do have to install the relevant version of reprex before running tests. And devtools::check() does that.

This is probably the real answer to your question, i.e. the most honest and foolproof answer. It's a bummer, because it makes test development a bit more clunky, but certain packages just are that way. (I will think about it more, in the background. I also have zero expertise with the parallel package, so I am not the person to ask if there's some feature you could be taking better advantage of. )

jennybc commented 1 year ago

I don't think that parallel::clusterExport() can ever really be the answer, because it feels like, in theory, you'd need to export every single function called directly or indirectly by the function you want to test (m2() in the example).

jennybc commented 1 year ago

Past history here, related to the SO thread, #571.

I also doubt that dev_mode() is the right solution, because dev_mode() feels basically superseded at this point (I will open an issue about marking it as such).

jennybc commented 1 year ago

I also note that changing the type of the cluster seems to help. This passes for me with test_active_file() and with test() (but not when executing the test innards line-by-line).

test_that("it works", {
  cl <- parallel::makeCluster(2, type = "FORK")
  withr::defer(parallel::stopCluster(cl))

  x <- list(1:10, 2:20, 3:30)

  expect_equal(tolerance = 0.00001,
               lapply(x, FUN = m2),
               parallel::parLapply(X = x, fun = m2, cl = cl)
               )
})
#> Test passed 😸

agranholm commented 1 year ago

Hi Jenny. Thanks for the detailed answer. I suspected there wasn't an easy workaround, but this is helpful, especially the elaboration on the difference in behavior regarding testing/installation between devtools::check() and devtools::test(). A suggestion could be briefly mention this in the function documentation; at least, it would have been helpful and would have saved some time in this case.

Of note, while using type = "FORK" in the parallel::makeCluster() call solves it in this case, it is not an option in the actual use case - we do cross-platform testing, and FORK'ing is not possible on Windows, and in the actual case, the cluster setup and dispatch is handled by a separate function and not directly specified in the test_that call.

The best solution I can come up with for now for handling this (if others have the same issue), is to use utils::packageVersion() on both the cluster workers (called using parallel::clusterCall()) and the main process. If the package version returned from the cluster workers is lower than that from the main process, then the particular test is skipped. Seems to be a solution that at least ensures that the test is run on relevant 'final' testing when using devtools::check(), but skipped when using devtools::test() where it would otherwise lead to a failed test.

jennybc commented 1 year ago

The best solution I can come up with for now for handling this ... is to use utils::packageVersion() on both the cluster workers (called using parallel::clusterCall()) and the main process. If the package version returned from the cluster workers is lower than that from the main process, then the particular test is skipped.

Based on my experience of this problem (which is mostly via reprex, as described above), I think this might be over-engineered or impractical. For example, unless you are constantly incrementing version number, this isn't even going to catch the mismatch we explored above. I think, in most cases where you actually care about this, utils::packageVersion() will be the same in the main process and the workers (both at some bleeding edge dev version). At least, that would be true for my workflow.

This function mismatch problem will only come up when you're tinkering with m2() (using your example as reference) and you have tests that compare main process to workers and you're running tests interactively through a shortcut like test(). So I would probably just leave myself a reminder that, if these parallel-involving tests of m2() are failing, do a quick install(), then test again. Or remember that the true test is what you see from check() or on CI. This is a problem only for your interactive workflow, not "in production".

agranholm commented 1 year ago

It works quite well in our workflow (we mostly only fully install the package when doing the very final checks prior to CRAN submission and replace it with the CRAN version afterwards), but agree that it isn't a universal solution - and given that it's a few lines of extra code in a single place wrapping the multicore tests, having the option to test() everything else and skipping this works here. But fully agree that it is not a universal solution; just wanted to mention the workaround here (given that there is not a fully working solution for test() unless installing) in case others run into the same.

Thanks for the help again - appreciate the clarification regarding testing in test() and check(); feel free to close this issue. From my point of view, a brief mention of the potential problem (and solutions) in the relevant function documentation (e.g., devtools::test() and friends) in cases where parallelization is used or where the functions are called from another package would've been helpful, so maybe worth mentioning. Thanks again!

r-lib / devtools

Error when using parallel::parLapply with devtools::test #2489