RfastOfficial / Rfast

A collection of Rfast functions for data analysis. Note 1: The vast majority of the functions accept matrices only, not data.frames. Note 2: Do not pass matrices or vectors that have missing data (i.e. NAs). We do not check for them, and C++ internally transforms them into zeros (0), so you may get wrong results. Note 3: In general, make sure you give the correct input in order to get the correct output. We do no checks, and this is one of the many reasons we are fast.

`Rfast` causes `future.apply` to fail #5

Closed wmacnair closed 2 years ago

wmacnair commented 4 years ago

Hey

I had an issue when calling future_lapply, and narrowed it down to Rfast. Here's a MWE:

library('future.apply')
# Loading required package: future
plan('multisession', workers=4)
# future_lapply works fine before library('Rfast')
test_l = future_lapply(1:32, function(x) sum(rnorm(1e6)))

# but after library('Rfast'), I get a 'node stack overflow' error
library('Rfast')
# Loading required package: Rcpp
# Loading required package: RcppZiggurat
test_l = future_lapply(1:32, function(x) sum(rnorm(1e6)))
# Error in tryCatchOne(expr, names, parentenv, handlers[[1L]]) :
#  node stack overflow

I can understand that two different approaches to parallelization might not play nicely together, but it seems like just importing a library should in principle not cause problems...

I like the idea of the package though! Should be very useful :)

Cheers,
Will

sessionInfo details:

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS:   /usr/local/R/R-3.6.1/lib/libRblas.so
LAPACK: /usr/local/R/R-3.6.1/lib/libRlapack.so

locale:
 [1] LC_CTYPE=C                 LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rfast_1.9.7        RcppZiggurat_0.1.5 Rcpp_1.0.3         future.apply_1.3.0
[5] future_1.15.1      colorout_1.2-1     BiocManager_1.30.4

loaded via a namespace (and not attached):
[1] compiler_3.6.1   parallel_3.6.1   tools_3.6.1      listenv_0.7.0
[5] codetools_0.2-16 digest_0.6.22    globals_0.12.4

ManosPapadakis95 commented 4 years ago

Yes, you are right. This is undefined behaviour. I will look into it further and let you know. Thanks for reporting the error.

jwbowers commented 4 years ago

Same problem here. Just recording it in case it helps.

> library('future.apply')
Loading required package: future
> plan('multisession', workers=4)
> test_l = future_lapply(1:32, function(x) sum(rnorm(1e6)))
> library('Rfast')
Loading required package: Rcpp
Loading required package: RcppZiggurat
> test_l = future_lapply(1:32, function(x) sum(rnorm(1e6)))
Error: node stack overflow
> sessionInfo()
  R version 4.0.0 (2020-04-24)
  Platform: x86_64-apple-darwin17.0 (64-bit)
  Running under: macOS Catalina 10.15.4

  Matrix products: default
  BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
  LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

 locale:
  [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base

  other attached packages:
  [1] Rfast_1.9.9        RcppZiggurat_0.1.5 Rcpp_1.0.4.6       future.apply_1.5.0 future_1.17.0      nvimcom_0.9-88

  loaded via a namespace (and not attached):
   [1] rex_1.2.0        xml2_1.3.2       R6_2.4.1         globals_0.12.5   tools_4.0.0      parallel_4.0.0   cyclocomp_1.1.0
   [8] lintr_2.0.1      withr_2.2.0      remotes_2.1.1    lazyeval_0.2.2   assertthat_0.2.1 rprojroot_1.3-2  digest_0.6.25
  [15] crayon_1.3.4     processx_3.4.2   callr_3.4.3      ps_1.3.2         codetools_0.2-16 compiler_4.0.0   desc_1.2.0
  [22] backports_1.1.6  listenv_0.8.0

sjmgarnier commented 4 years ago

I have a similar issue here as well. I receive a "C stack usage is too close to the limit" error when trying to use Rfast with future_*apply functions.

nevilamos commented 4 years ago

Similar here:

y=1982
myvals<-sample(c(rep(0,25),1900:2000),size = 10^8,replace = T)
M1<-matrix(myvals,10^6,100)
future_mapply(F2,M1,c(1960,1970,1980))

Error: node stack overflow
Error during wrapup: node stack overflow

sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] future.apply_1.6.0   future_1.17.0        Rfast_1.9.9          RcppZiggurat_0.1.5   Rcpp_1.0.4.6         microbenchmark_1.4-7

loaded via a namespace (and not attached):
[1] compiler_3.6.2   parallel_3.6.2   tools_3.6.2      listenv_0.8.0    codetools_0.2-16 digest_0.6.25    globals_0.12.5

ChristophH commented 4 years ago

Same problem here. I wanted to include Rfast::qpois.reg in a package that also uses future_lapply. Now I get

Error: node stack overflow

even when not calling any Rfast functions. Is there anything I can do to help figure out what the problem is? I would love to be able to use the speedy quasi-Poisson regression provided by Rfast.

ChristophH commented 2 years ago

Rfast 2.0.6 does not cause the problem anymore. It seems like it had to do with the print.environment function which has been removed. For details, see this future issue.
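
For anyone who needs to guard against the affected releases, here is a minimal sketch of a version check based only on the versions reported in this thread. The helper name `rfast_version_ok` is hypothetical (it is not part of Rfast or future), and the assumption that every release from 2.0.6 onward is unaffected rests on the report above, not on a full bisection.

```r
# Hypothetical helper, not part of Rfast or future.
# Returns TRUE if `installed` is at least `fixed`, the first Rfast version
# reported in this thread as no longer triggering the error (an assumption
# taken from the comment above).
rfast_version_ok <- function(installed, fixed = "2.0.6") {
  as.package_version(installed) >= as.package_version(fixed)
}

rfast_version_ok("1.9.9")  # FALSE: a version from the failing reports above
rfast_version_ok("2.0.6")  # TRUE: the version reported as working
```

In a real session the installed version could be passed in directly, e.g. `rfast_version_ok(utils::packageVersion("Rfast"))`, before deciding whether to call `library('Rfast')` alongside `future_lapply`.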