I'm creating this issue to track downstream packages that do not use doRNG::%dorng% to generate parallel-safe random numbers, or that do use it but still trigger a warning from the future framework.
[x] plyr - true positive (moved to #61)
[x] BiocParallel - false positive (because it rolls its own L'Ecuyer-CMRG seeds internally)
Examples
Package 'plyr'
doFuture::registerDoFuture()
y <- plyr::llply(1:2, rnorm, .parallel = TRUE)
Warning messages:
1: In setup_parallel() : No parallel backend registered
2: UNRELIABLE VALUE: One of the foreach() iterations ('doFuture-1') unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, use '%dorng%' from the 'doRNG' package instead of '%dopar%'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, set option 'future.rng.onMisuse' to "ignore".
Conclusion: This is a true RNG mistake. This happens because plyr does not use doRNG. Also, looking at the source code, there are no other attempts to use parallel-safe RNG.
Workaround: After doFuture::registerDoFuture(), call doRNG::registerDoRNG(), which will automatically turn all %dopar% to %dorng%, e.g.
doFuture::registerDoFuture()
doRNG::registerDoRNG()
y <- plyr::llply(1:2, rnorm, .parallel = TRUE)
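One way to check that the workaround takes effect is to run the same call twice from the same seed and compare the results. This is only a sketch: it assumes the plyr, doFuture and doRNG packages are installed, and the names y1 and y2 are illustrative.

```r
# Register futures as the foreach backend, then let doRNG take over %dopar%.
doFuture::registerDoFuture()
doRNG::registerDoRNG()

# Same seed, same call, twice.
set.seed(42)
y1 <- plyr::llply(1:2, rnorm, .parallel = TRUE)
set.seed(42)
y2 <- plyr::llply(1:2, rnorm, .parallel = TRUE)

# Compare the drawn values only; unlist() drops any list-level attributes.
same_draws <- identical(unlist(y1), unlist(y2))
```

If doRNG is driving the loop, same_draws is expected to be TRUE.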
Package 'BiocParallel'
doFuture::registerDoFuture()
BiocParallel::register(BiocParallel::DoparParam())
y <- BiocParallel::bplapply(1:2, rnorm)
Warning message:
UNRELIABLE VALUE: One of the foreach() iterations ('doFuture-1') unexpectedly generated random numbers without declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, use '%dorng%' from the 'doRNG' package instead of '%dopar%'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, set option 'future.rng.onMisuse' to "ignore".
Conclusion: This is a false-positive because BiocParallel deploys L'Ecuyer-CMRG seeds internally. Although they're not invariant to the number of parallel workers(*), they're statistically sound. (*) There is work in progress for this, cf. https://github.com/Bioconductor/BiocParallel/pull/130.
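To illustrate what "invariant to the number of parallel workers" means, here is a minimal base-R sketch that assigns one L'Ecuyer-CMRG stream per iteration rather than per worker. It relies only on parallel::nextRNGStream(); the helper draw_with_seed() and the variable names are hypothetical, and this is not a description of how BiocParallel or doRNG work internally.

```r
# Switch to the L'Ecuyer-CMRG generator (note: this changes the global RNG kind).
RNGkind("L'Ecuyer-CMRG")
set.seed(42)

# Pre-compute one independent RNG stream per iteration.
n_iter <- 4
seeds <- vector("list", n_iter)
seed <- get(".Random.seed", envir = globalenv())
for (i in seq_len(n_iter)) {
  seeds[[i]] <- seed
  seed <- parallel::nextRNGStream(seed)
}

# Hypothetical helper: install the iteration's stream, then draw from it.
draw_with_seed <- function(seed, n) {
  assign(".Random.seed", seed, envir = globalenv())
  rnorm(n)
}

# Each iteration owns its stream, so the processing order does not matter.
y_forward <- lapply(seeds, draw_with_seed, n = 2)
y_reverse <- rev(lapply(rev(seeds), draw_with_seed, n = 2))
order_invariant <- identical(y_forward, y_reverse)
```

Because every iteration carries its own stream, how the iterations are split across workers, or in what order they run, does not change the numbers drawn.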
To disable the false warnings, set the option 'future.rng.onMisuse' to "ignore".
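A minimal sketch of silencing the check. The option name 'future.rng.onMisuse' is taken from the warning text quoted above. Saving and restoring the previous value is an added precaution, and the names old_opts and current are illustrative.

```r
# Turn off the RNG-misuse check, remembering the previous setting.
old_opts <- options(future.rng.onMisuse = "ignore")
current <- getOption("future.rng.onMisuse")

# ... run the code that is known to be a false positive here ...

# Restore the previous setting so other code is still checked.
options(old_opts)
```

Limiting the change to the code that needs it keeps the check active for true positives elsewhere.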
Workaround: No workaround needed.
However, if one wants to have parallel RNG that is invariant to the number of workers, call doRNG::registerDoRNG() after doFuture::registerDoFuture(), which will automatically turn all %dopar% to %dorng%.
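The original example for this step did not survive, so the following is a reconstruction from the calls already shown in this issue. It assumes the BiocParallel, doFuture and doRNG packages are installed.

```r
# Register futures as the foreach backend, then let doRNG take over %dopar%.
doFuture::registerDoFuture()
doRNG::registerDoRNG()

# Tell BiocParallel to dispatch through the registered foreach backend.
BiocParallel::register(BiocParallel::DoparParam())
y <- BiocParallel::bplapply(1:2, rnorm)
```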