futureverse / future.apply

R package: future.apply - Apply Function to Elements in Parallel using Futures
https://future.apply.futureverse.org

Rare error in next_random_seed() #122

Closed. mb706 closed this 1 week ago.

mb706 commented 1 month ago

In very rare circumstances, next_random_seed() in future_xapply() fails:

set.seed(5313910L)
runif(623) -> .
future.apply::future_mapply(function(iter) iter, iter = 1:10, future.seed = TRUE)
#> Error: ‘!any(seed_next != seed)’ is not TRUE
traceback()
#> 6: stop(cond) at utils,conditions.R#12
#> 5: stopf("%s is not TRUE", sQuote(call), call. = FALSE, domain = NA) at utils.R#17
#> 4: stop_if_not(!any(seed_next != seed)) at rng_utils.R#61
#> 3: next_random_seed() at future_xapply.R#46
#> 2: future_xapply(FUN = FUN, nX = nX, chunk_args = dots, MoreArgs = MoreArgs,
#>        get_chunk = function(X, chunk) lapply(X, FUN = `chunkWith[[`,
#>            chunk), expr = expr, envir = envir, future.envir = future.envir,
#>        future.globals = future.globals, future.packages = future.packages,
#>        future.scheduling = future.scheduling, future.chunk.size = future.chunk.size,
#>        future.stdout = future.stdout, future.conditions = future.conditions,
#>        future.seed = future.seed, future.label = future.label, fcn_name = fcn_name,
#>        args_name = args_name, debug = debug) at future_mapply.R#137
#> 1: future.apply::future_mapply(function(iter) iter, iter = 1:10,
#>        future.seed = TRUE)

This seems to happen because .Random.seed gains an NA value...

set.seed(5313910L)
runif(623) -> .
any(is.na(.Random.seed))
#> [1] FALSE
runif(2)
#> [1] 0.1667102 0.6395424
any(is.na(.Random.seed))
#> [1] TRUE
sessionInfo()
#> R version 4.4.0 (2024-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Red Hat Enterprise Linux 8.9 (Ootpa)
#>
#> Matrix products: default
#> BLAS/LAPACK: /apps/u/spack/gcc/12.2.0/openblas/0.3.21-6supzcg/lib/libopenblas-r0.3.21.so;  LAPACK version 3.9.0
#>
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/Denver
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats     graphics  grDevices datasets  utils     methods   base
#>
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.4.0      parallelly_1.38.0   parallel_4.4.0
#>  [4] tools_4.4.0         listenv_0.9.1       future.apply_1.11.2
#>  [7] codetools_0.2-19    data.table_1.16.0   digest_0.6.37
#> [10] globals_0.16.3      RhpcBLASctl_0.23-42 renv_1.0.7
#> [13] future_1.34.0
#>
RNGkind()
#> [1] "Mersenne-Twister" "Inversion"        "Rejection"

mb706 commented 1 month ago

This apparently happens because the minimal integer value represents NA in R: https://stackoverflow.com/questions/56507748/internal-representation-of-int-na

(credit to @sebffischer for finding the reason here)
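To illustrate the point about the integer representation, here is a minimal sketch in base R. It only assumes the usual 32-bit two's-complement integers; `seed_has_na()` is a hypothetical helper written for this example and is not part of future.apply.

```r
# Largest integer R can represent; the smallest non-NA integer is its negation.
int_max <- .Machine$integer.max   # 2147483647

# One step below -int_max overflows to NA (R also emits a warning).
below_min <- suppressWarnings(-int_max - 1L)
is.na(below_min)                  # TRUE

# Reading the raw bit pattern 0x80000000 (INT_MIN in two's complement)
# as a 4-byte little-endian integer also yields NA_integer_.
int_min_bits <- readBin(as.raw(c(0x00, 0x00, 0x00, 0x80)),
                        what = "integer", size = 4L, endian = "little")
is.na(int_min_bits)               # TRUE

# Hypothetical helper: does an RNG state vector contain such a word?
# By default it inspects .Random.seed in the global environment, if it exists.
seed_has_na <- function(seed = get0(".Random.seed", envir = globalenv(),
                                    inherits = FALSE)) {
  !is.null(seed) && anyNA(seed)
}
```

So whenever one of the Mersenne-Twister state words happens to have exactly that bit pattern, R shows it as NA in .Random.seed, even though the RNG state itself is perfectly valid.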

jemus42 commented 2 weeks ago

For completeness' sake, I think this also affects the future package.

HenrikBengtsson commented 1 week ago

Thanks. I'm really impressed that you managed to track down a reproducible example!

"This apparently happens because the minimal integer value represents NA in R"

Yes, this was an oversight of mine, where I did not anticipate NA_integer_. Looking at help("Random", package = "base"), they only "alert" the reader that values can be negative;

"In the underlying C, .Random.seed[-1] is unsigned; therefore in R .Random.seed[-1] can be negative, due to the representation of an unsigned integer by a signed integer. "

but they forgot to mention that the one remaining bit pattern, INT_MIN (one below R_INT_MIN = -INT_MAX), is NA_integer_ in R. @mb706 would you mind reporting this to https://bugs.r-project.org/, or to R-devel?
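For readers wondering why a single NA makes the check from the traceback fail, here is a small sketch. The vectors are toy stand-ins for .Random.seed, and `identical()` is shown only as one NA-safe way to compare two seeds; it is not necessarily how future.apply handles this.

```r
# Toy stand-ins for two copies of .Random.seed, one word of which is NA.
seed      <- c(10403L, 624L, NA_integer_, 42L)
seed_next <- seed   # an unchanged copy

# The elementwise comparison is NA wherever a word is NA, so any() returns NA
# (there is no TRUE to settle it), and negating NA is still NA.
check <- !any(seed_next != seed)
is.na(check)                     # TRUE: neither TRUE nor FALSE

# An assertion that requires TRUE therefore fails, although the seeds are equal.
isTRUE(check)                    # FALSE

# identical() compares the vectors as a whole and treats NA as equal to NA.
identical(seed_next, seed)       # TRUE
```

The comparison is otherwise correct; it only breaks in the rare case where a state word equals the NA bit pattern.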

HenrikBengtsson commented 1 week ago

I forgot to say: I've submitted future.apply 1.11.3 to CRAN, which fixes this.