stan-dev / rstan

RStan, the R interface to Stan
https://mc-stan.org

Windows R session crashes when assigning multiple runs to same output #844

Open andrjohns opened 4 years ago

andrjohns commented 4 years ago

Summary:

Following on from this issue in the brms repo, the R session on Windows will crash when repeatedly assigning to the same output.

This is most easily replicated with brms. Using this list of brms arguments: test_brm.RData, the crash can be replicated via:

options(mc.cores=4)
library(brms)
load("test_brm.RData")

out = do.call(brm,test_brm)
out = do.call(brm,test_brm)
out = do.call(brm,test_brm)

#Crashes on this call:
out = do.call(brm,test_brm)

However, when I extract the Stan code, data and initial values from the brms object into test_rstan.txt and do the same runs with 'pure' RStan, I don't get the same failures:

options(mc.cores=4)
library(rstan)
source("test_rstan.txt")

out = do.call(stan,test_rstan)
out = do.call(stan,test_rstan)
out = do.call(stan,test_rstan)

#Does not crash here
out = do.call(stan,test_rstan)

This leads me to believe the crash is related to the model compilation in some way, since the brms calls recompile the model each time, but the rstan calls do not (even if I set rstan_options(auto_write = FALSE)).
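One way to probe the compilation hypothesis, as a minimal sketch rather than a confirmed workaround: fit once with brm (which compiles the model), then refit through update() with recompile = FALSE so the later runs reuse the compiled model. The helper name refit_without_recompile and the n_refits argument are made up for illustration; whether this actually avoids the crash has not been checked here.

```r
# Sketch only: fit once (compiles the Stan model), then refit several times
# while reusing the compiled model instead of recompiling.
# `refit_without_recompile` and `n_refits` are hypothetical names.
refit_without_recompile <- function(args, n_refits = 3) {
  fit <- do.call(brms::brm, args)   # first call compiles the model
  for (i in seq_len(n_refits)) {
    # update() on a brmsfit accepts recompile = FALSE to skip recompilation
    fit <- update(fit, recompile = FALSE)
  }
  fit
}

# Only run when brms and the attachment from this report are available.
if (requireNamespace("brms", quietly = TRUE) && file.exists("test_brm.RData")) {
  options(mc.cores = 4)
  load("test_brm.RData")            # provides the list `test_brm`
  out <- refit_without_recompile(test_brm)
}
```

If the session survives these refits but still crashes on repeated plain brm calls, that would point more strongly at compilation (or the loading and unloading of the compiled model) rather than at sampling.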

Contents of my Makevars & .Renviron files:

> readLines("~/.R/Makevars.win")
[1] "CXX14FLAGS += -O3"

> readLines("~/.Renviron")
[1] "PATH=\"${RTOOLS40_HOME}\\usr\\bin;${PATH}\""

Session Info:

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rstan_2.21.2         ggplot2_3.3.2        StanHeaders_2.21.0-6

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5         pillar_1.4.6       compiler_4.0.2     prettyunits_1.1.1  tools_4.0.2       
 [6] pkgbuild_1.1.0     jsonlite_1.7.0     lifecycle_0.2.0    tibble_3.0.3       gtable_0.3.0      
[11] pkgconfig_2.0.3    rlang_0.4.7        cli_2.0.2          rstudioapi_0.11    parallel_4.0.2    
[16] curl_4.3           loo_2.3.1          gridExtra_2.3      withr_2.2.0        dplyr_1.0.2       
[21] generics_0.0.2     vctrs_0.3.4        stats4_4.0.2       grid_4.0.2         tidyselect_1.1.0  
[26] glue_1.4.2         inline_0.3.15      R6_2.4.1           processx_3.4.3     fansi_0.4.1       
[31] callr_3.4.3        purrr_0.3.4        magrittr_1.5       codetools_0.2-16   matrixStats_0.56.0
[36] scales_1.1.1       ps_1.3.4           ellipsis_0.3.1     assertthat_0.2.1   colorspace_1.4-1  
[41] V8_3.2.0           RcppParallel_5.0.2 munsell_0.5.0      crayon_1.3.4

andrjohns commented 4 years ago

Additionally, it looks like the crashes don't occur if I manually trigger a garbage collection between each brms run:

options(mc.cores=4)
library(brms)
load("test_brm.RData")

out = do.call(brm,test_brm)
gc()
out = do.call(brm,test_brm)
gc()
out = do.call(brm,test_brm)
gc()
out = do.call(brm,test_brm)
gc()
out = do.call(brm,test_brm)
gc()
out = do.call(brm,test_brm)
gc()
out = do.call(brm,test_brm)

brian-bucher commented 4 years ago

I did a little more playing around (see issue 994 in the brms repo), and found that if I call brm twice and then do a gc(), it crashes during the gc() step:

options(mc.cores=4)
library(brms)
load("test_brm.RData")

out = do.call(brm,test_brm)
out = do.call(brm,test_brm)
gc()    #crashes at this step

So, whether I do a third do.call(brm, test_brm), or two of those and then a gc(), the crash occurs during the third step.
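Taken together with the earlier report that calling gc() after every single run avoids the crash, that pattern can be wrapped in a small helper. This is a sketch under that assumption only, not a fix for the underlying problem; the name run_with_gc and its arguments are made up for illustration.

```r
# Sketch only: repeat a fit, assigning to the same variable each time and
# forcing a garbage collection after every run, before the next one starts.
# `run_with_gc`, `fit_fun`, `args` and `n_runs` are hypothetical names.
run_with_gc <- function(fit_fun, args, n_runs) {
  out <- NULL
  for (i in seq_len(n_runs)) {
    out <- do.call(fit_fun, args)  # overwrite the previous result
    gc()                           # collect it before the next run
  }
  out
}

# Stand-in fit function so the helper can be tried without brms or rstan.
dummy_fit <- function(x) x * 2
res <- run_with_gc(dummy_fit, list(x = 21), n_runs = 3)
```

With brms loaded, the equivalent call would be run_with_gc(brm, test_brm, n_runs = 7), which mirrors the seven runs in the gc() example earlier in this thread.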

andrjohns commented 4 years ago

An example of this crashing behaviour using just RStan-only code also recently popped up on the forums: https://discourse.mc-stan.org/t/random-but-consistent-c-stack-error-on-windows-10/17799/4

dbarneche commented 4 years ago

Another thing I just noticed that reproduced the C stack trace issue on the Windows VM from our GitHub Actions. I'm not quite sure if the code below will break universally, but thought I'd post it here for you to give it a go as well. Even if I give each object a different name, but remove each one before running the next, it breaks.

options(mc.cores=4)
library(brms)
load("test_brm.RData")

out_1 = do.call(brm,test_brm)
rm(out_1)
out_2 = do.call(brm,test_brm)
rm(out_2)
out_3 = do.call(brm,test_brm)
rm(out_3)
out_4 = do.call(brm,test_brm) #crashes at this step

andrewdolman commented 4 years ago

Just adding here that the steps brian-bucher posted above (two do.call(brm, test_brm) calls followed by gc()) crash my R session too.

JWiley commented 2 years ago

Has any progress been made on this? I am having very similar issues. For my personal use, I use cmdstanr, which works well. However, I am working on a package, and cmdstanr is difficult to depend on for unit tests and vignettes on CRAN, or for things like GitHub Actions, because R packages do not have a convenient way to declare that they depend on cmdstanr and on CmdStan being successfully installed.