stan-dev / rstan

RStan, the R interface to Stan
https://mc-stan.org

vb() crashes R on 64bit Windows 10 #258

Closed paul-buerkner closed 7 years ago

paul-buerkner commented 8 years ago

When fitting models with vb(), R consistently crashes for some models right after the first line of the "Begin stochastic gradient ascent" output is printed. This happens, for instance, with the following GLMM:

## also attaches rstan
library(brms)

## simple weibull GLMM
stancode <- make_stancode(time ~ age + sex + disease + (1|patient), 
                          data = kidney, family = weibull("log"))
stanmodel <- stan_model(model_code = stancode)
standata <- make_standata(time ~ age + sex + disease + (1|patient), 
                          data = kidney, family = weibull("log"))

## runs without error
fit1 <- sampling(stanmodel, data = standata)

## crashes R
fit2 <- vb(stanmodel, data = standata)

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] brms_0.7.0.9000 rstan_2.9.0     ggplot2_2.0.0  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.3      lattice_0.20-33  MASS_7.3-45      grid_3.2.3       plyr_1.8.3       nlme_3.1-123     gtable_0.1.2    
 [8] stats4_3.2.3     scales_0.3.0     minqa_1.2.4      nloptr_1.0.4     Matrix_1.2-3     splines_3.2.3    statmod_1.4.22  
[15] lme4_1.1-10      tools_3.2.3      munsell_0.4.2    abind_1.4-3      parallel_3.2.3   inline_0.3.14    colorspace_1.2-6
[22] gridExtra_2.0.0 
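
Since a crash like this takes down the whole R session, one workaround while debugging is to run the failing call in a child R process. The following is only a sketch: `run_isolated()` is a hypothetical helper written for illustration, not part of rstan or brms, and it assumes that `Rscript` is available in `R.home("bin")`.

```r
## Hypothetical helper (not part of rstan or brms): run R code in a child
## process so that a hard crash cannot take down the interactive session.
run_isolated <- function(code) {
  ## Write the code to a temporary script and clean it up afterwards.
  script <- tempfile(fileext = ".R")
  writeLines(code, script)
  on.exit(unlink(script))

  ## Assumption: Rscript lives next to the running R binary.
  rscript <- file.path(R.home("bin"), "Rscript")

  ## system2() warns on a non-zero exit status and stores it in the
  ## "status" attribute; the attribute is absent when the status is 0.
  out <- suppressWarnings(
    system2(rscript, args = c("--vanilla", shQuote(script)),
            stdout = TRUE, stderr = TRUE)
  )
  status <- attr(out, "status")
  list(status = if (is.null(status)) 0L else status,
       output = as.character(out))
}

## Toy usage: the parent session keeps running whatever the child does.
res <- run_isolated("cat('still alive\\n')")
res$status
```

To reproduce the crash this way, the child script would need to contain the full example, including `library(brms)` and the model compilation, because nothing is shared with the parent session. A segfault in the child should then show up as a non-zero `status` instead of a dead session.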

jgabry commented 8 years ago

Thanks for reporting. This particular example doesn't crash on my Mac, nor have I noticed vb() crashing for other models, so this might just be a Windows issue. Has anyone else experienced this?

bgoodri commented 8 years ago

I can replicate this crash on Windows but not on Linux. However, on Linux the objective function takes on values of huge magnitude and ADVI does not seem to converge.

bgoodri commented 8 years ago

I can make it crash with cmdstan on Windows, so I am going to attempt to move this issue to stan-dev/stan rather than stan-dev/rstan.

bgoodri commented 8 years ago

Actually, I just opened a new issue at https://github.com/stan-dev/stan/issues/1758

bgoodri commented 8 years ago

Here is the output, if that helps:

$ ./weibull.exe variational data file=weibull.data.R random seed=1913258051
method = variational
  variational
    algorithm = meanfield (Default)
      meanfield
    iter = 10000 (Default)
    grad_samples = 1 (Default)
    elbo_samples = 100 (Default)
    eta = 1 (Default)
    adapt
      engaged = 1 (Default)
      iter = 50 (Default)
    tol_rel_obj = 0.01 (Default)
    eval_elbo = 100 (Default)
    output_samples = 1000 (Default)
id = 0 (Default)
data
  file = weibull.data.R
init = 2 (Default)
random
  seed = 1913258051
output
  file = output.csv (Default)
  diagnostic_file =  (Default)
  refresh = 100 (Default)

This is Automatic Differentiation Variational Inference.

(EXPERIMENTAL ALGORITHM: expect frequent updates to the procedure.)

Gradient evaluation took 0.001 seconds
1000 iterations under these settings should take 1 seconds.
Adjust your expectations accordingly!

Begin eta adaptation.
Iteration:   1 / 250 [  0%]  (Adaptation)
Iteration:  50 / 250 [ 20%]  (Adaptation)
Iteration: 100 / 250 [ 40%]  (Adaptation)
Iteration: 150 / 250 [ 60%]  (Adaptation)
Iteration: 200 / 250 [ 80%]  (Adaptation)
Iteration: 250 / 250 [100%]  (Adaptation)
Success! Found best value [eta = 0.1].

Begin stochastic gradient ascent.
  iter       ELBO   delta_ELBO_mean   delta_ELBO_med   notes
   100    -6e+026             1.000            1.000
   200    -3e+035             1.000            1.000
   300    -9e+029        124494.257            1.000
   400    -7e+026         93657.116         1145.695
   500    -6e+028         74925.890            1.000
   600    -6e+019     160718225.825         1145.695
   700    -1e+025     137758479.421            1.000
   800    -9e+023     120538671.192           13.583
   900    -5e+022     107145487.365           13.583
  1000    -1e+019      96431328.765           16.753
  1100    -6e+024      96431328.765           16.753   MAY BE DIVERGING... INSPECT ELBO
  1200    -4e+053      96431328.765           16.753   MAY BE DIVERGING... INSPECT ELBO
  1300    -7e+036  5275433547021831.000           16.753   MAY BE DIVERGING... INSPECT ELBO
  1400    -4e+025  5275452512627888.000           16.753   MAY BE DIVERGING... INSPECT ELBO
  1500    -7e+022  5275452512627940.000          520.533   MAY BE DIVERGING... INSPECT ELBO
  1600    -9e+018  5275452416235278.000          520.533   MAY BE DIVERGING... INSPECT ELBO
  1700    -2e+023  5275452416235278.000          520.533   MAY BE DIVERGING... INSPECT ELBO
  1800    -1e+026  5275452416235277.000          520.533   MAY BE DIVERGING... INSPECT ELBO
  1900    -2e+028  5275452416235275.000          520.533   MAY BE DIVERGING... INSPECT ELBO
  2000    -1e+022  5275452416396606.000          520.533   MAY BE DIVERGING... INSPECT ELBO
  2100    -4e+017  5275452416399370.000         8115.745   MAY BE DIVERGING... INSPECT ELBO
  2200    -1e+020  5275452416399370.000         8115.745   MAY BE DIVERGING... INSPECT ELBO
  2300    -1e+029   18965771520.472          520.533   MAY BE DIVERGING... INSPECT ELBO
  2400    -2e+019     691281072.496          520.533   MAY BE DIVERGING... INSPECT ELBO
  2500    -4e+020     691281020.539            1.000   MAY BE DIVERGING... INSPECT ELBO
  2600    -4e+019     691280209.933            1.000   MAY BE DIVERGING... INSPECT ELBO
  2700    -6e+016     691280273.930            9.687   MAY BE DIVERGING... INSPECT ELBO
  2800    -4e+020     691280273.930            9.687   MAY BE DIVERGING... INSPECT ELBO
  2900    -1e+018     691280310.307          364.763   MAY BE DIVERGING... INSPECT ELBO
  3000    -3e+023     691118588.463            9.687   MAY BE DIVERGING... INSPECT ELBO
  3100    -1e+027     691115825.537            1.000   MAY BE DIVERGING... INSPECT ELBO
  3200    -1e+020     692099666.864            9.687   MAY BE DIVERGING... INSPECT ELBO
  3300    -5e+020     692099666.834            9.687   MAY BE DIVERGING... INSPECT ELBO
  3400    -6e+024        983943.535            1.000   MAY BE DIVERGING... INSPECT ELBO
  3500    -6e+023        983944.194            7.551   MAY BE DIVERGING... INSPECT ELBO
  3600    -5e+015      13714836.708            7.551   MAY BE DIVERGING... INSPECT ELBO
  3700    -1e+074      13714772.711            1.000   MAY BE DIVERGING... INSPECT ELBO
Segmentation fault
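
For anyone reading the delta_ELBO columns above: they appear to be relative changes between successive ELBO evaluations, summarised over a window. The sketch below recomputes them in base R from the first five printed ELBO values. The formula (change relative to the current value, starting from a previous value of 0) and the median rule (upper of the two middle values when the count is even) are assumptions inferred from the printed numbers, not taken from the Stan source. The inputs are rounded to one significant digit, so only the order of magnitude is expected to match the log.

```r
## ELBO values as printed in the log (rounded) for iterations 100 to 500.
elbo <- c(-6e26, -3e35, -9e29, -7e26, -6e28)

## Assumption: relative change is measured against the current value,
## and the "previous" value before the first evaluation is 0.
prev <- c(0, head(elbo, -1))
rel_change <- abs((prev - elbo) / elbo)

## Assumption: for an even count the median is the upper of the two
## middle values (R's median() would average them instead).
upper_median <- function(x) sort(x)[floor(length(x) / 2) + 1]

## Running summaries over all evaluations seen so far. The log suggests a
## window of 10 evaluations, which these five values do not yet exceed.
idx <- seq_along(rel_change)
running_mean   <- sapply(idx, function(i) mean(rel_change[1:i]))
running_median <- sapply(idx, function(i) upper_median(rel_change[1:i]))

print(data.frame(iter = idx * 100, elbo, rel_change,
                 running_mean, running_median))
```

Under these assumptions, a single jump across several orders of magnitude (iteration 200 to 300 here) dominates the mean for as long as it stays in the window, which would explain why delta_ELBO_mean remains enormous while delta_ELBO_med stays comparatively small.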

bgoodri commented 8 years ago

All I can seem to coax out of a gdb backtrace is:

#0  0x74a8a05f in ?? ()
#1  0x74a8a00b in ?? ()
#2  0x00514a70 in std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, char, double) const ()
#3  0x0028e57c in ?? ()
Backtrace stopped: not enough registers or memory available to unwind further