ropensci / drake

An R-focused pipeline toolkit for reproducibility and high-performance computing
https://docs.ropensci.org/drake
GNU General Public License v3.0

drake/future/multisession complains about unreliable random seed but values are ok #1333

Closed kkmann closed 3 years ago

kkmann commented 3 years ago

Description

Evaluating a drake plan with multisession futures gives a warning about the random seed. The values seem to be OK, though: two runs with the same seed produce identical results.

Reproducible example

> drake::clean()
> library(drake)
> 
> i <- 1:10
> plan <- drake::drake_plan(
+     test = target({
+         runif(1)
+     }, dynamic = map(i))
+ )
> 
> future::plan(future::multisession)
> drake::make(plan, seed = 42, parallelism = "future")
▶ dynamic test
> subtarget test_0b3474bd
> subtarget test_b2a5c9b8
> subtarget test_71f311ad
> subtarget test_98cf3c11
> subtarget test_0a86c9cb
> subtarget test_cb15b01f
> subtarget test_8531e6ff
> subtarget test_28b16d75
> subtarget test_4edaada2
> subtarget test_db4b2027
■ finalize test
Warning message:
UNRELIABLE VALUE: Future (‘test_0b3474bd’) unexpectedly generated random numbers without specifying argument '[future.]seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify argument '[future.]seed', e.g. 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use [future].seed=NULL, or set option 'future.rng.onMisuse' to "ignore". 
> tmp1 <- unlist(readd(test))
> drake::clean()
> drake::make(plan, seed = 42, parallelism = "future")
▶ dynamic test
> subtarget test_0b3474bd
> subtarget test_b2a5c9b8
> subtarget test_71f311ad
> subtarget test_98cf3c11
> subtarget test_0a86c9cb
> subtarget test_cb15b01f
> subtarget test_8531e6ff
> subtarget test_28b16d75
> subtarget test_4edaada2
> subtarget test_db4b2027
■ finalize test
> tmp2 <- unlist(readd(test))
> 
> tmp1 - tmp2
 [1] 0 0 0 0 0 0 0 0 0 0

Expected result

If this is not a problem, drake should suppress the warning without requiring user input.

Session info

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin19.5.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /usr/local/Cellar/openblas/0.3.10_1/lib/libopenblasp-r0.3.10.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] drake_7.12.5

loaded via a namespace (and not attached):
 [1] igraph_1.2.5      rstudioapi_0.11   magrittr_1.5      hms_0.5.3         progress_1.2.2    tidyselect_1.1.0 
 [7] R6_2.4.1          rlang_0.4.7       fansi_0.4.1       storr_1.2.1       globals_0.13.1    tools_4.0.2      
[13] parallel_4.0.2    cli_2.0.2         ellipsis_0.3.1    base64url_1.4     digest_0.6.25     assertthat_0.2.1 
[19] tibble_3.0.3      lifecycle_0.2.0   crayon_1.3.4      txtq_0.2.3        purrr_0.3.4       codetools_0.2-16 
[25] vctrs_0.3.2       glue_1.4.1        pillar_1.4.6      compiler_4.0.2    filelock_1.0.2    backports_1.1.10 
[31] prettyunits_1.1.1 future_1.19.1     listenv_0.8.0     pkgconfig_2.0.3  

wlandau commented 3 years ago

I see you are using drake 7.12.5. These warnings were fixed in https://github.com/ropensci/drake/commit/de861f86565faed70ba72cb487703eb662cb0e72 and the patch is in 7.12.6 (current CRAN).
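
For anyone who wants to check whether an installation predates that fix, here is a minimal sketch. `needs_upgrade()` is a hypothetical helper written for this illustration, not a drake function; the version numbers come from the session info and the reply above, and the commented-out lines assume a configured CRAN mirror.

```r
# Hypothetical helper (not part of drake): TRUE if `installed` is older than
# the release that contains the fix. "7.12.6" is taken from the reply above.
needs_upgrade <- function(installed, fixed = "7.12.6") {
  package_version(installed) < package_version(fixed)
}

# Version reported in the session info of this issue.
needs_upgrade("7.12.5")

# In a live session, compare against the installed package instead:
# if (needs_upgrade(as.character(utils::packageVersion("drake")))) {
#   install.packages("drake")  # assumes a configured CRAN mirror
# }

# Stopgap named in the warning text itself, if upgrading is not an option:
# options(future.rng.onMisuse = "ignore")
```

The `options()` line only silences the check in the future package; upgrading is the route suggested in the reply.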