Unable to replicate the 'SBC for brms models'

rroyaute commented 1 month ago

Hi, Thank you for a really instructive package! I am having some trouble implementing SBC when using brms. It seems that everyting runs fine when using cmdstanr, though I am much less knowledgable in stan and would prefer sticking to brms when possible. Using the first example just returns the following errors: Error in rstan::read_stan_csv(fit$output_files()) : object 'n_kept' not found. In addition: Warning message: In parse_stancsv_comments(comments) : NAs introduced by coercion

The second example (the one I hope to be implementing for my purposes) just returns an empty list for the first two elements:

results_func$stats
[1] sim_id          rhat            ess_bulk        ess_tail       
[5] rank            simulated_value max_rank       
<0 rows> (or 0-length row.names)

All packages seem to be up to date and I don't have any other issues running brms models otherwise. I'm not sure what may be going wrong here. Any suggestions would be much appreciated! Thanks, Raph

sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Paris
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] cmdstanr_0.8.1.9000     rstan_2.35.0.9000       StanHeaders_2.35.0.9000
[4] future_1.33.2           ggplot2_3.5.1           brms_2.21.6            
[7] Rcpp_1.0.13             SBC_0.3.0.9000         

loaded via a namespace (and not attached):
 [1] gtable_0.3.5          tensorA_0.36.2.1      xfun_0.46            
 [4] QuickJSR_1.3.1        processx_3.8.4        inline_0.3.19        
 [7] lattice_0.22-6        ps_1.7.7              vctrs_0.6.5          
[10] tools_4.4.1           generics_0.1.3        stats4_4.4.1         
[13] curl_5.2.1            parallel_4.4.1        tibble_3.2.1         
[16] fansi_1.0.6           pkgconfig_2.0.3       Matrix_1.7-0         
[19] data.table_1.15.4     checkmate_2.3.1       distributional_0.4.0 
[22] RcppParallel_5.1.8    lifecycle_1.0.4       farver_2.1.2         
[25] compiler_4.4.1        stringr_1.5.1         Brobdingnag_1.2-9    
[28] munsell_0.5.1         codetools_0.2-20      bayesplot_1.11.1.9000
[31] tidyr_1.3.1           pillar_1.9.0          cachem_1.1.0         
[34] bridgesampling_1.1-2  abind_1.4-5           nlme_3.1-165         
[37] parallelly_1.38.0     posterior_1.6.0       tidyselect_1.2.1     
[40] digest_0.6.36         mvtnorm_1.2-5         stringi_1.8.4        
[43] purrr_1.0.2           dplyr_1.1.4           listenv_0.9.1        
[46] labeling_0.4.3        fastmap_1.2.0         grid_4.4.1           
[49] colorspace_2.1-1      cli_3.6.3             magrittr_2.0.3       
[52] loo_2.8.0.9000        pkgbuild_1.4.4        utf8_1.2.4           
[55] future.apply_1.11.2   withr_3.0.0           scales_1.3.0         
[58] backports_1.5.0       estimability_1.5.1    matrixStats_1.3.0    
[61] emmeans_1.10.3        globals_0.16.3        gridExtra_2.3        
[64] coda_0.19-4.1         memoise_2.0.1         knitr_1.48           
[67] V8_4.4.2              rstantools_2.4.0.9000 rlang_1.1.4          
[70] xtable_1.8-4          glue_1.7.0            rstudioapi_0.16.0    
[73] jsonlite_1.8.8        R6_2.5.1

martinmodrak commented 1 month ago

Thanks for reporting!

Unfortunately, I don't have access to a macOS machine. I just checked that the vignettes work on Windows and Linux (with quite similar main package versions). I'd like to make sure this isn't an issue with your Stan installation. Are you able to run normal brms models in the same project?

Also a quick note - you can use brms with either cmdstanr or rstan as its backend (via the backend argument to brm or options(brms.backend = XXX) call) and you very rarely have to worry about the differences. So if it works with brms + cmdstanr, then no real need to have it work with brms + rstan as well.

rroyaute commented 1 month ago

Thank you for your reply! My models seem to compile fine regardless of the backend. I am using cmdstan as a default in my case so am a bit puzzled as to why I'm getting an rstan error (I've set cmdstan as a global option as shown in the vignette with use_cmdstanr <- getOption("SBC.vignettes_cmdstanr", TRUE).

I've ran a number of additional tests:

Run all examples with rstan. For example 1, the generate_dataset() still fails but with a different error message this time: Error: Some 'draw_ids' indices are out of range.The second example runs fine and I'm able to replicate the figures without problems.
Run again with cmdstanr for a sanity check. Again, the first example complains about not finding object 'n_kept' and the second example doesn't fare better. Here's the detailed output for results_func():

---- End of output for simulation 5 -----
Too many simulations produced errors. Further error messages not shown.
 - 100 (100%) fits resulted in an error. Not all diagnostics are OK. ou can learn more by inspecting $default_diagnostics, $backend_diagnostics and/or investigating $outputs/$messages/$warnings for detailed output from the backend.
Warning message: In compute_SBC(datasets_func, backend_func, dquants = log_lik_dq_func,  :  All simulations produced error when fitting

In summary, with cmdstanr neither examples work but they fail at different points in the script (example 1 at datasets_func(), example 2 at results_func(). With rstan, the first example fails at generate_dataset(), but the second example runs fine!

I have no idea what might be going wrong here! I'm running my own model through rstan now and will update if anything unexpected comes about. Otherwise, I guess my only option would be reinstalling stan and cmdstan, right?

martinmodrak commented 1 month ago

Oh, I see the source for 'Error: Some 'draw_ids' indices are out of range.' Could you try with the latest version from the main branch?

martinmodrak commented 1 month ago

For the cmdstanr case, could you share the contents of results_func$errors, results_func$warnings and results_func$outputs (most likely there's a lot of repetition, so just the first elements might be enough).

rroyaute commented 1 month ago

Thanks! I'll retry the rstan version from the main branch in just a bit. Here are the errors I get with the cmdstanr case: <simpleError in rstan::read_stan_csv(fit$output_files()): object 'n_kept' not found>. The chains are running fine so this is not the issue. For some reason the results don't get written into $stats and $fits.

> results_func$outputs
[[100]]
 [1] "Running MCMC with 1 chain..."                      
 [2] ""                                                  
 [3] "Chain 1 Iteration:    1 / 2000 [  0%]  (Warmup) "  
 [4] "Chain 1 Iteration:  100 / 2000 [  5%]  (Warmup) "  
 [5] "Chain 1 Iteration:  200 / 2000 [ 10%]  (Warmup) "  
 [6] "Chain 1 Iteration:  300 / 2000 [ 15%]  (Warmup) "  
 [7] "Chain 1 Iteration:  400 / 2000 [ 20%]  (Warmup) "  
 [8] "Chain 1 Iteration:  500 / 2000 [ 25%]  (Warmup) "  
 [9] "Chain 1 Iteration:  600 / 2000 [ 30%]  (Warmup) "  
[10] "Chain 1 Iteration:  700 / 2000 [ 35%]  (Warmup) "  
[11] "Chain 1 Iteration:  800 / 2000 [ 40%]  (Warmup) "  
[12] "Chain 1 Iteration:  900 / 2000 [ 45%]  (Warmup) "  
[13] "Chain 1 Iteration: 1000 / 2000 [ 50%]  (Warmup) "  
[14] "Chain 1 Iteration: 1001 / 2000 [ 50%]  (Sampling) "
[15] "Chain 1 Iteration: 1100 / 2000 [ 55%]  (Sampling) "
[16] "Chain 1 Iteration: 1200 / 2000 [ 60%]  (Sampling) "
[17] "Chain 1 Iteration: 1300 / 2000 [ 65%]  (Sampling) "
[18] "Chain 1 Iteration: 1400 / 2000 [ 70%]  (Sampling) "
[19] "Chain 1 Iteration: 1500 / 2000 [ 75%]  (Sampling) "
[20] "Chain 1 Iteration: 1600 / 2000 [ 80%]  (Sampling) "
[21] "Chain 1 Iteration: 1700 / 2000 [ 85%]  (Sampling) "
[22] "Chain 1 Iteration: 1800 / 2000 [ 90%]  (Sampling) "
[23] "Chain 1 Iteration: 1900 / 2000 [ 95%]  (Sampling) "
[24] "Chain 1 Iteration: 2000 / 2000 [100%]  (Sampling) "
[25] "Chain 1 finished in 0.1 seconds."

rroyaute commented 1 month ago

Oh, I see the source for 'Error: Some 'draw_ids' indices are out of range.' Could you try with the latest version from the main branch?

Everything runs fine with rstan now, I made a silly mistake in the call to the backend function with compute_sbc()!

martinmodrak commented 1 month ago

So I'd guess the error you are getting with cmdstanr (the object 'n_kept' not found one) is not an SBC problem but rather cmdstanr/rstan/brms thing. I can't reproduce it myself, but here's some background:

When brms uses cmdstanr as backend, it will make cmdstanr write out .csv output files and then load those via rstan::read_stan_csv (this is why there's rstan involved even when you use cmdstanr)
The error likely stems from here: https://github.com/stan-dev/rstan/blob/9b8d8fe715f1d51063621e6ca01d6265b02356fe/rstan/rstan/R/stan_csv.R#L226 - note that n_kept is assigned only in the if ... else if bloack above and if save_warmup is in some messy state, n_kept may never be assigned.
So my guess is that for some reason, there is some mismatch between the CSV generated by CmdStan and what read_stan_csv expects

Could you verify that you get the error also when you run simple things like

df <- data.frame(y = rnorm(10))
brms::brm(y ~ 1, data = df, backend = "cmdstanr")

If that's true, I'd ask you to file a bug at the rstan (or discuss on Stan forums), since you are the one who's seeing it and thus can help resolve it.

Also, what's your cmdstanr::cmdstan_version()?

rroyaute commented 1 month ago

Thanks! Using cmdstan version 2.35.0, I don't get any errors. It definetely seems it's some miscommunication between cmdstanr and rstan as you said. I'll drop an issue at rstan later on. Thanks for the help!

> set.seed(213452)
> df <- data.frame(y = rnorm(10))
> brms::brm(y ~ 1, data = df, backend = "cmdstanr")

Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: y ~ 1 
   Data: df (Number of observations: 10) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.04      0.31    -0.67     0.58 1.00     2093     1293

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.94      0.26     0.58     1.62 1.00     1598     1577

Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

martinmodrak commented 1 month ago

OK, I was a bit too quick, inspecting the code I realized it is SBC who's calling read_stan_csv (I forgot we overrode that behaviour from brms, but it was in fact necessary). And I can reproduce the error myself.

martinmodrak commented 1 month ago

And this is a known problem: https://github.com/stan-dev/rstan/issues/1133 Seems like downgrading CmdStan to 2.34.1 avoids the problem.

rroyaute commented 1 month ago

Thank you! I'm fine sticking to rstan for now, so hopefully the compatibility between CmdStan's newest version and rstan gets resolved by the time I'm ready to calibrate models requiring more computing power.

martinmodrak commented 1 month ago

So it turns out that the issue can be completely worked around on the SBC side by using brms::read_csv_as_stanfit() (which wasn't available when I first made the implementation of brms backends). The current version on master should have the issue fixed.

hyunjimoon / SBC

Unable to replicate the 'SBC for brms models' #104