`stan()` writes to the global namespace during sampling

billdenney commented 5 years ago

Summary:

stan() writes an object to the global environment during sampling, which causes an error with the drake package (see ropensci/drake#960).

Description:

When stan() is used within a drake workflow, it reaches outside its own namespace and writes objects to the global environment, which drake locks by default while make() runs. It would be helpful if all writes went to a model-specific environment instead, so that drake can be used for reproducible analysis.

More details are in ropensci/drake#960.
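The failure mode can be reproduced without rstan or drake. The following is a minimal sketch in base R, assuming only that the locked environment behaves like any environment passed to `lockEnvironment()`; the object names are made up for illustration.

```r
# Illustration only: a plain environment stands in for the locked
# global environment. All object names here are hypothetical.
env <- new.env()
assign("existing", 1, envir = env)

# Lock the environment. With the default bindings = FALSE, existing
# bindings can still be modified, but no new ones can be created.
lockEnvironment(env)

# Changing an existing binding still works.
assign("existing", 2, envir = env)

# Creating a new binding fails; capture the message instead of stopping.
msg <- tryCatch(
  {
    assign("new_object", 3, envir = env)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message should mention a locked environment, which matches the error reported by `diagnose(fit1)` in the reproducible steps.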

Reproducible Steps:

library(rstan)
#> Loading required package: StanHeaders
#> Loading required package: ggplot2
#> rstan (Version 2.19.2, GitRev: 2e1f913d3ca3)
#> For execution on a local, multicore CPU with excess RAM we recommend calling
#> options(mc.cores = parallel::detectCores()).
#> To avoid recompilation of unchanged Stan programs, we recommend calling
#> rstan_options(auto_write = TRUE)
#> For improved execution time, we recommend calling
#> Sys.setenv(LOCAL_CPPFLAGS = '-march=native')
#> although this causes Stan to throw an error on a few processors.
library(drake)

plan_stan <-
  drake_plan(
    scode="
parameters {
  real y[2]; 
} 
model {
  y[1] ~ normal(0, 1);
  y[2] ~ double_exponential(0, 2);
} 
",
    fit1=stan(model_code = scode, iter = 10, verbose = FALSE)
  )
make(plan_stan)
#> target scode
#> target fit1
#> fail fit1
#> Error: Target `fit1` failed. Call `diagnose(fit1)` for details. Error message:
#>   cannot add bindings to a locked environment. 
#> Please read the "Self-invalidation" section of the make() help file.
diagnose(fit1)
#> $error
#> <simpleError in assign(mname, def, where): cannot add bindings to a locked environment. 
#> Please read the "Self-invalidation" section of the make() help file.>
make(plan_stan, lock_envir=FALSE)
#> target fit1
#> 
#> SAMPLING FOR MODEL '08aca439b1af079914fdcfd62fb992d8' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 0 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: WARNING: No variance estimation is
#> Chain 1:          performed for num_warmup < 20
#> Chain 1: 
#> Chain 1: Iteration: 1 / 10 [ 10%]  (Warmup)
#> Chain 1: Iteration: 2 / 10 [ 20%]  (Warmup)
#> Chain 1: Iteration: 3 / 10 [ 30%]  (Warmup)
#> Chain 1: Iteration: 4 / 10 [ 40%]  (Warmup)
#> Chain 1: Iteration: 5 / 10 [ 50%]  (Warmup)
#> Chain 1: Iteration: 6 / 10 [ 60%]  (Sampling)
#> Chain 1: Iteration: 7 / 10 [ 70%]  (Sampling)
#> Chain 1: Iteration: 8 / 10 [ 80%]  (Sampling)
#> Chain 1: Iteration: 9 / 10 [ 90%]  (Sampling)
#> Chain 1: Iteration: 10 / 10 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0 seconds (Warm-up)
#> Chain 1:                0 seconds (Sampling)
#> Chain 1:                0 seconds (Total)
#> Chain 1: 
#> 
#> SAMPLING FOR MODEL '08aca439b1af079914fdcfd62fb992d8' NOW (CHAIN 2).
#> Chain 2: 
#> Chain 2: Gradient evaluation took 0 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2: 
#> Chain 2: 
#> Chain 2: WARNING: No variance estimation is
#> Chain 2:          performed for num_warmup < 20
#> Chain 2: 
#> Chain 2: Iteration: 1 / 10 [ 10%]  (Warmup)
#> Chain 2: Iteration: 2 / 10 [ 20%]  (Warmup)
#> Chain 2: Iteration: 3 / 10 [ 30%]  (Warmup)
#> Chain 2: Iteration: 4 / 10 [ 40%]  (Warmup)
#> Chain 2: Iteration: 5 / 10 [ 50%]  (Warmup)
#> Chain 2: Iteration: 6 / 10 [ 60%]  (Sampling)
#> Chain 2: Iteration: 7 / 10 [ 70%]  (Sampling)
#> Chain 2: Iteration: 8 / 10 [ 80%]  (Sampling)
#> Chain 2: Iteration: 9 / 10 [ 90%]  (Sampling)
#> Chain 2: Iteration: 10 / 10 [100%]  (Sampling)
#> Chain 2: 
#> Chain 2:  Elapsed Time: 0 seconds (Warm-up)
#> Chain 2:                0 seconds (Sampling)
#> Chain 2:                0 seconds (Total)
#> Chain 2: 
#> 
#> SAMPLING FOR MODEL '08aca439b1af079914fdcfd62fb992d8' NOW (CHAIN 3).
#> Chain 3: 
#> Chain 3: Gradient evaluation took 0 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3: 
#> Chain 3: 
#> Chain 3: WARNING: No variance estimation is
#> Chain 3:          performed for num_warmup < 20
#> Chain 3: 
#> Chain 3: Iteration: 1 / 10 [ 10%]  (Warmup)
#> Chain 3: Iteration: 2 / 10 [ 20%]  (Warmup)
#> Chain 3: Iteration: 3 / 10 [ 30%]  (Warmup)
#> Chain 3: Iteration: 4 / 10 [ 40%]  (Warmup)
#> Chain 3: Iteration: 5 / 10 [ 50%]  (Warmup)
#> Chain 3: Iteration: 6 / 10 [ 60%]  (Sampling)
#> Chain 3: Iteration: 7 / 10 [ 70%]  (Sampling)
#> Chain 3: Iteration: 8 / 10 [ 80%]  (Sampling)
#> Chain 3: Iteration: 9 / 10 [ 90%]  (Sampling)
#> Chain 3: Iteration: 10 / 10 [100%]  (Sampling)
#> Chain 3: 
#> Chain 3:  Elapsed Time: 0 seconds (Warm-up)
#> Chain 3:                0 seconds (Sampling)
#> Chain 3:                0 seconds (Total)
#> Chain 3: 
#> 
#> SAMPLING FOR MODEL '08aca439b1af079914fdcfd62fb992d8' NOW (CHAIN 4).
#> Chain 4: 
#> Chain 4: Gradient evaluation took 0 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4: 
#> Chain 4: 
#> Chain 4: WARNING: No variance estimation is
#> Chain 4:          performed for num_warmup < 20
#> Chain 4: 
#> Chain 4: Iteration: 1 / 10 [ 10%]  (Warmup)
#> Chain 4: Iteration: 2 / 10 [ 20%]  (Warmup)
#> Chain 4: Iteration: 3 / 10 [ 30%]  (Warmup)
#> Chain 4: Iteration: 4 / 10 [ 40%]  (Warmup)
#> Chain 4: Iteration: 5 / 10 [ 50%]  (Warmup)
#> Chain 4: Iteration: 6 / 10 [ 60%]  (Sampling)
#> Chain 4: Iteration: 7 / 10 [ 70%]  (Sampling)
#> Chain 4: Iteration: 8 / 10 [ 80%]  (Sampling)
#> Chain 4: Iteration: 9 / 10 [ 90%]  (Sampling)
#> Chain 4: Iteration: 10 / 10 [100%]  (Sampling)
#> Chain 4: 
#> Chain 4:  Elapsed Time: 0 seconds (Warm-up)
#> Chain 4:                0 seconds (Sampling)
#> Chain 4:                0 seconds (Total)
#> Chain 4:
#> Warning: target fit1 warnings:
#>   The largest R-hat is 1.58, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> http://mc-stan.org/misc/warnings.html#r-hat
#>   Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> http://mc-stan.org/misc/warnings.html#bulk-ess
#>   Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> http://mc-stan.org/misc/warnings.html#tail-ess
#> Target fit1 messages:
#>   recompiling to avoid crashing R session
diagnose(fit1)
#> $name
#> [1] "fit1"
#> 
#> $target
#> [1] "fit1"
#> 
#> $imported
#> [1] FALSE
#> 
#> $missing
#> [1] TRUE
#> 
#> $seed
#> [1] 757455598
#> 
#> $time_start
#>    user  system elapsed 
#>    2.54    0.51   54.92 
#> 
#> $file_out
#> NULL
#> 
#> $isfile
#> [1] FALSE
#> 
#> $trigger
#> $trigger$command
#> [1] TRUE
#> 
#> $trigger$depend
#> [1] TRUE
#> 
#> $trigger$file
#> [1] TRUE
#> 
#> $trigger$condition
#> [1] FALSE
#> 
#> $trigger$change
#> NULL
#> 
#> $trigger$mode
#> [1] "whitelist"
#> 
#> 
#> $command
#> [1] "stan(model_code = scode, iter = 10, verbose = FALSE)"
#> 
#> $dependency_hash
#> [1] "060321ad3b4f11bf"
#> 
#> $input_file_hash
#> [1] ""
#> 
#> $output_file_hash
#> [1] ""
#> 
#> $time_command
#> $time_command$target
#> [1] "fit1"
#> 
#> $time_command$elapsed
#> [1] 50.36
#> 
#> $time_command$user
#> [1] 0.74
#> 
#> $time_command$system
#> [1] 0.03
#> 
#> 
#> $warnings
#> [1] "The largest R-hat is 1.58, indicating chains have not mixed.\nRunning the chains for more iterations may help. See\nhttp://mc-stan.org/misc/warnings.html#r-hat"                                                         
#> [2] "Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.\nRunning the chains for more iterations may help. See\nhttp://mc-stan.org/misc/warnings.html#bulk-ess"           
#> [3] "Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.\nRunning the chains for more iterations may help. See\nhttp://mc-stan.org/misc/warnings.html#tail-ess"
#> 
#> $messages
#> [1] "recompiling to avoid crashing R session"
#> 
#> $time_build
#> $time_build$target
#> [1] "fit1"
#> 
#> $time_build$elapsed
#> [1] 50.61
#> 
#> $time_build$user
#> [1] 0.99
#> 
#> $time_build$system
#> [1] 0.03

Created on 2019-07-28 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value
#>  version  R version 3.6.1 (2019-07-05)
#>  os       Windows 10 x64
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United States.1252
#>  ctype    English_United States.1252
#>  tz       America/New_York
#>  date     2019-07-28
#>
#> - Packages --------------------------------------------------------------
#>  package     * version   date       lib source
#>  assertthat    0.2.1     2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.4     2019-04-10 [1] CRAN (R 3.6.0)
#>  base64url     1.4       2018-05-14 [1] CRAN (R 3.6.0)
#>  callr         3.3.1     2019-07-18 [1] CRAN (R 3.6.1)
#>  cli           1.1.0     2019-03-19 [1] CRAN (R 3.6.0)
#>  codetools     0.2-16    2018-12-24 [2] CRAN (R 3.6.1)
#>  colorspace    1.4-1     2019-03-18 [1] CRAN (R 3.6.0)
#>  crayon        1.3.4     2017-09-16 [1] CRAN (R 3.6.0)
#>  desc          1.2.0     2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools      2.1.0     2019-07-06 [1] CRAN (R 3.6.1)
#>  digest        0.6.20    2019-07-04 [1] CRAN (R 3.6.1)
#>  dplyr         0.8.3     2019-07-04 [1] CRAN (R 3.6.1)
#>  drake       * 7.4.0     2019-06-07 [1] CRAN (R 3.6.0)
#>  evaluate      0.14      2019-05-28 [1] CRAN (R 3.6.1)
#>  fs            1.3.1     2019-05-06 [1] CRAN (R 3.6.1)
#>  ggplot2     * 3.2.0     2019-06-16 [1] CRAN (R 3.6.0)
#>  glue          1.3.1     2019-03-12 [1] CRAN (R 3.6.0)
#>  gridExtra     2.3       2017-09-09 [1] CRAN (R 3.6.0)
#>  gtable        0.3.0     2019-03-25 [1] CRAN (R 3.6.0)
#>  highr         0.8       2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools     0.3.6     2017-04-28 [1] CRAN (R 3.6.0)
#>  igraph        1.2.4.1   2019-04-22 [1] CRAN (R 3.6.0)
#>  inline        0.3.15    2018-05-18 [1] CRAN (R 3.6.0)
#>  knitr         1.23      2019-05-18 [1] CRAN (R 3.6.1)
#>  lazyeval      0.2.2     2019-03-15 [1] CRAN (R 3.6.0)
#>  loo           2.1.0     2019-03-13 [1] CRAN (R 3.6.0)
#>  magrittr      1.5       2014-11-22 [1] CRAN (R 3.6.0)
#>  matrixStats   0.54.0    2018-07-23 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0     2017-04-21 [1] CRAN (R 3.6.0)
#>  munsell       0.5.0     2018-06-12 [1] CRAN (R 3.6.0)
#>  pillar        1.4.2     2019-06-29 [1] CRAN (R 3.6.1)
#>  pkgbuild      1.0.3     2019-03-20 [1] CRAN (R 3.6.0)
#>  pkgconfig     2.0.2     2018-08-16 [1] CRAN (R 3.6.0)
#>  pkgload       1.0.2     2018-10-29 [1] CRAN (R 3.6.0)
#>  prettyunits   1.0.2     2015-07-13 [1] CRAN (R 3.6.0)
#>  processx      3.4.1     2019-07-18 [1] CRAN (R 3.6.1)
#>  ps            1.3.0     2018-12-21 [1] CRAN (R 3.6.0)
#>  purrr         0.3.2     2019-03-15 [1] CRAN (R 3.6.0)
#>  R6            2.4.0     2019-02-14 [1] CRAN (R 3.6.0)
#>  Rcpp          1.0.1     2019-03-17 [1] CRAN (R 3.6.0)
#>  remotes       2.1.0     2019-06-24 [1] CRAN (R 3.6.1)
#>  rlang         0.4.0     2019-06-25 [1] CRAN (R 3.6.1)
#>  rmarkdown     1.14      2019-07-12 [1] CRAN (R 3.6.1)
#>  rprojroot     1.3-2     2018-01-03 [1] CRAN (R 3.6.0)
#>  rstan       * 2.19.2    2019-07-09 [1] CRAN (R 3.6.1)
#>  scales        1.0.0     2018-08-09 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1     2018-11-05 [1] CRAN (R 3.6.0)
#>  StanHeaders * 2.18.1-10 2019-06-14 [1] CRAN (R 3.6.1)
#>  storr         1.2.1     2018-10-18 [1] CRAN (R 3.6.0)
#>  stringi       1.4.3     2019-03-12 [1] CRAN (R 3.6.0)
#>  stringr       1.4.0     2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.1.1     2019-04-23 [1] CRAN (R 3.6.0)
#>  tibble        2.1.3     2019-06-06 [1] CRAN (R 3.6.1)
#>  tidyselect    0.2.5     2018-10-11 [1] CRAN (R 3.6.0)
#>  usethis       1.5.1     2019-07-04 [1] CRAN (R 3.6.1)
#>  withr         2.1.2     2018-03-15 [1] CRAN (R 3.6.0)
#>  xfun          0.8       2019-06-25 [1] CRAN (R 3.6.1)
#>  yaml          2.2.0     2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] C:/Users/Bill Denney/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-3.6.1/library
```

Current Output:

The target fails with the error `cannot add bindings to a locked environment` unless make() is called with lock_envir=FALSE.

Expected Output:

I expect the model to run and return a stanfit object.

RStan Version:

The version of RStan you are running: 2.19.2, GitRev: 2e1f913d3ca3

R Version:

The version of R you are running: 3.6.1 (2019-07-05)

Operating System:

Your operating system: Windows 10 x64

billdenney commented 5 years ago

Thanks to some further investigation by @wlandau, he found that a class was being defined during sampling. I believe that the issue is related to the following lines:

https://github.com/stan-dev/rstan/blob/98fa82efee6320adae1289eeb9566ec60f6d6c99/rstan3/R/AllClass.R#L209-L210

Given that the class is named after an MD5 hash, I assume that it can't be defined in advance. But could the class be placed into a model-specific namespace instead of the global namespace?
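As a rough illustration of what a "model-specific namespace" could look like, the sketch below defines an S4 class with `methods::setClass()` and its `where` argument pointing at a dedicated environment. This is not rstan's code: it assumes the class definition goes through `setClass()`, and `model_env`, the class name, and the `package` label are all hypothetical.

```r
library(methods)

# Hypothetical dedicated environment standing in for a model-specific namespace.
model_env <- new.env()

# Hypothetical class name; the thread says the real name is an MD5 hash.
class_name <- "model_08aca439"

# `where` controls which environment receives the class metadata.
# `package` is given explicitly (a made-up label) so that R does not
# have to derive a package name from the unnamed environment.
generator <- setClass(
  class_name,
  representation(code = "character"),
  where = model_env,
  package = "hypotheticalModelPkg"
)

# S4 class metadata is stored under a hidden name of the form ".__C__<class>".
meta_name <- paste0(".__C__", class_name)
in_model_env <- exists(meta_name, envir = model_env, inherits = FALSE)
in_global_env <- exists(meta_name, envir = globalenv(), inherits = FALSE)
print(c(model_env = in_model_env, global_env = in_global_env))
```

If the sketch behaves as intended, the metadata binding is found in `model_env` and not in the global environment, so a locked global environment would not be touched by the definition itself. Whether the same approach carries over to the code path in rstan is an open question in this thread.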

mcol commented 5 years ago

I think the error is somewhere else, as the rstan3 code is not used in the current package. You should look instead under rstan/rstan.

A quick search for .GlobalEnv or globalenv() didn't reveal any obvious culprit to me (the places I found read from the global environment rather than write to it), but I don't really know the internals, so the write to the global namespace may happen somewhere else.
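When a source search does not find the write, a runtime check can at least show which names appear. The following is a minimal sketch in base R; `new_global_bindings()` is a hypothetical helper, and the `assign()` call is only a stand-in for a call such as `stan()`.

```r
# Hypothetical helper: evaluate `expr` and report the names it adds to
# the global environment. all.names = TRUE matters because hidden
# objects (names starting with ".") would otherwise be missed.
new_global_bindings <- function(expr) {
  before <- ls(globalenv(), all.names = TRUE)
  force(expr)  # the lazily evaluated argument runs here, after the snapshot
  setdiff(ls(globalenv(), all.names = TRUE), before)
}

# Stand-in for the real call: writes one hidden object to the global environment.
added <- new_global_bindings(
  assign(".hypothetical_side_effect", 1, envir = globalenv())
)
print(added)

# Clean up the stand-in object again.
rm(list = added, envir = globalenv())
```

Wrapping the real call in such a helper, in a session where the global environment is not locked, would list any new bindings without needing to know where in the internals the write happens.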

billdenney commented 5 years ago

I think that I traced the issue to this line:

https://github.com/stan-dev/rstan/blob/04b4210af576a35f0b2812fe1dd616a2834b020f/rstan/rstan/R/stanmodel-class.R#L104

If I am following the rest of the error trace correctly (https://github.com/ropensci/drake/issues/960#issuecomment-544301968), this relates to loading an Rcpp shared library and being able to access it. The assignment to the global environment appears to happen via Rcpp, but I wonder whether it can be avoided in some way.

bgoodri commented 4 years ago

I am not sure. That is really old code, and I'm not sure what the reasons were for putting the module into the global environment, but Rcpp Modules are pretty fragile.
