Closed rroyaute closed 1 month ago
Thanks for reporting!
Unfortunately, I don't have access to a macOS machine. I just checked that the vignettes work on Windows and Linux (with quite similar main package versions). I'd like to make sure this isn't an issue with your Stan installation. Are you able to run normal brms models in the same project?
Also a quick note: you can use brms with either cmdstanr or rstan as its backend (via the backend argument to brm or an options(brms.backend = XXX) call) and you very rarely have to worry about the differences. So if it works with brms + cmdstanr, then there is no real need to have it work with brms + rstan as well.
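As a minimal sketch of the two ways of choosing the backend mentioned above: the option name brms.backend and the backend argument come from the comment itself, while fit_intercept_only is a hypothetical helper name used only for illustration. Calling it assumes brms and a working Stan toolchain for the chosen backend are installed.

```r
# Session-wide default: brm() falls back to this option when `backend`
# is not passed explicitly.
options(brms.backend = "cmdstanr")

# Per-call choice. `fit_intercept_only` is a hypothetical helper, not part
# of brms or SBC; running it requires brms plus the chosen backend.
fit_intercept_only <- function(backend = getOption("brms.backend", "rstan")) {
  df <- data.frame(y = rnorm(10))
  brms::brm(y ~ 1, data = df, backend = backend)
}
```

With this in place, fit_intercept_only() would use cmdstanr and fit_intercept_only("rstan") would use rstan, which makes it easy to check whether a problem is specific to one backend.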
Thank you for your reply!
My models seem to compile fine regardless of the backend. I am using cmdstan as a default in my case, so I am a bit puzzled as to why I'm getting an rstan error (I've set cmdstan as a global option as shown in the vignette with use_cmdstanr <- getOption("SBC.vignettes_cmdstanr", TRUE)).
I've run a number of additional tests:
- rstan. For example 1, generate_dataset() still fails, but with a different error message this time: Error: Some 'draw_ids' indices are out of range. The second example runs fine and I'm able to replicate the figures without problems.
- cmdstanr, for a sanity check. Again, the first example complains about not finding object 'n_kept', and the second example doesn't fare better. Here's the detailed output for results_func():
---- End of output for simulation 5 -----
Too many simulations produced errors. Further error messages not shown.
 - 100 (100%) fits resulted in an error.
Not all diagnostics are OK.
You can learn more by inspecting $default_diagnostics, $backend_diagnostics and/or investigating $outputs/$messages/$warnings for detailed output from the backend.
Warning message: In compute_SBC(datasets_func, backend_func, dquants = log_lik_dq_func, : All simulations produced error when fitting
In summary, with cmdstanr neither example works, but they fail at different points in the script (example 1 at datasets_func(), example 2 at results_func()). With rstan, the first example fails at generate_dataset(), but the second example runs fine!
I have no idea what might be going wrong here! I'm running my own model through rstan now and will update if anything unexpected comes about. Otherwise, I guess my only option would be reinstalling stan and cmdstan, right?
Oh, I see the source for 'Error: Some 'draw_ids' indices are out of range.' Could you try with the latest version from the main branch?
For the cmdstanr case, could you share the contents of results_func$errors, results_func$warnings and results_func$outputs (most likely there's a lot of repetition, so just the first elements might be enough)?
Thanks! I'll retry the rstan version from the main branch in just a bit. Here are the errors I get with the cmdstanr case:
<simpleError in rstan::read_stan_csv(fit$output_files()): object 'n_kept' not found>
The chains are running fine, so this is not the issue. For some reason the results don't get written into $stats and $fits.
> results_func$outputs
[[100]]
[1] "Running MCMC with 1 chain..."
[2] ""
[3] "Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup) "
[4] "Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup) "
[5] "Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup) "
[6] "Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup) "
[7] "Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup) "
[8] "Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup) "
[9] "Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup) "
[10] "Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup) "
[11] "Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup) "
[12] "Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup) "
[13] "Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup) "
[14] "Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling) "
[15] "Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling) "
[16] "Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling) "
[17] "Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling) "
[18] "Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling) "
[19] "Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling) "
[20] "Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling) "
[21] "Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling) "
[22] "Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling) "
[23] "Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling) "
[24] "Chain 1 Iteration: 2000 / 2000 [100%] (Sampling) "
[25] "Chain 1 finished in 0.1 seconds."
Everything runs fine with rstan now. I made a silly mistake in the call to the backend function with compute_SBC()!
So I'd guess the error you are getting with cmdstanr (the object 'n_kept' not found one) is not an SBC problem but rather a cmdstanr/rstan/brms thing. I can't reproduce it myself, but here's some background:
- When brms uses cmdstanr as backend, it will make cmdstanr write out .csv output files and then load those via rstan::read_stan_csv (this is why there's rstan involved even when you use cmdstanr).
- n_kept is assigned only in the if ... else if block above, and if save_warmup is in some messy state, n_kept may never be assigned.
- [...] read_stan_csv expects [...]
Could you verify that you get the error also when you run simple things like
df <- data.frame(y = rnorm(10))
brms::brm(y ~ 1, data = df, backend = "cmdstanr")
If that's true, I'd ask you to file a bug at rstan (or discuss on the Stan forums), since you are the one who's seeing it and thus can help resolve it.
Also, what's your cmdstanr::cmdstan_version()?
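To illustrate the n_kept failure mode described in the background notes, here is a toy sketch in base R. It is not the actual rstan source: count_kept_draws, its arguments, and the "false" header value are all assumptions made up for the example. It only shows how an if ... else if chain without a final else can leave a variable unassigned when the flag is in an unexpected state.

```r
# Toy stand-in for the pattern described above; NOT the real rstan code.
count_kept_draws <- function(save_warmup, n_warmup = 1000L, n_sampling = 1000L) {
  if (identical(save_warmup, 0L)) {
    n_kept <- n_sampling              # warmup draws were not saved
  } else if (identical(save_warmup, 1L)) {
    n_kept <- n_warmup + n_sampling   # warmup draws were saved as well
  }
  # No final `else`: any other flag value leaves n_kept unassigned,
  # so the next line fails with "object 'n_kept' not found".
  n_kept
}

count_kept_draws(0L)
count_kept_draws(1L)

# An unparseable header value (here the made-up string "false") becomes NA
# under as.integer(), with an "NAs introduced by coercion" warning.
bad_flag <- suppressWarnings(as.integer("false"))
res <- try(count_kept_draws(bad_flag), silent = TRUE)
```

In this sketch neither branch matches the NA flag, so the function errors when it reaches n_kept, which mirrors the shape of the error and the coercion warning reported in this thread.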
Thanks! Using cmdstan version 2.35.0, I don't get any errors. It definitely seems to be some miscommunication between cmdstanr and rstan, as you said. I'll drop an issue at rstan later on. Thanks for the help!
> set.seed(213452)
> df <- data.frame(y = rnorm(10))
> brms::brm(y ~ 1, data = df, backend = "cmdstanr")
Family: gaussian
Links: mu = identity; sigma = identity
Formula: y ~ 1
Data: df (Number of observations: 10)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -0.04 0.31 -0.67 0.58 1.00 2093 1293
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.94 0.26 0.58 1.62 1.00 1598 1577
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
OK, I was a bit too quick: inspecting the code, I realized it is SBC that's calling read_stan_csv (I forgot we overrode that behaviour from brms, but it was in fact necessary). And I can reproduce the error myself.
And this is a known problem: https://github.com/stan-dev/rstan/issues/1133 Seems like downgrading CmdStan to 2.34.1 avoids the problem.
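A rough way to check whether a given CmdStan version falls in the range reported here is sketched below. The helper name cmdstan_may_break_read_stan_csv is hypothetical, and the 2.35.0 cutoff is an assumption based only on the versions mentioned in this thread (2.35.0 shows the error, 2.34.1 does not); later rstan or SBC releases may make the check unnecessary.

```r
# Hypothetical helper; the 2.35.0 cutoff is an assumption drawn from this
# thread, not a documented compatibility boundary.
cmdstan_may_break_read_stan_csv <- function(version) {
  numeric_version(version) >= numeric_version("2.35.0")
}

cmdstan_may_break_read_stan_csv("2.35.0")  # TRUE
cmdstan_may_break_read_stan_csv("2.34.1")  # FALSE
```

In a session with cmdstanr installed, the version string could be supplied as cmdstan_may_break_read_stan_csv(cmdstanr::cmdstan_version()).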
Thank you! I'm fine sticking to rstan for now, so hopefully the compatibility between CmdStan's newest version and rstan gets resolved by the time I'm ready to calibrate models requiring more computing power.
So it turns out that the issue can be completely worked around on the SBC side by using brms::read_csv_as_stanfit() (which wasn't available when I first made the implementation of brms backends). The current version on master should have the issue fixed.
Hi, thank you for a really instructive package! I am having some trouble implementing SBC when using brms. It seems that everything runs fine when using cmdstanr, though I am much less knowledgeable in Stan and would prefer sticking to brms when possible. Using the first example just returns the following errors:
Error in rstan::read_stan_csv(fit$output_files()) : object 'n_kept' not found. In addition: Warning message: In parse_stancsv_comments(comments) : NAs introduced by coercion
The second example (the one I hope to be implementing for my purposes) just returns an empty list for the first two elements:
All packages seem to be up to date and I don't have any other issues running brms models otherwise. I'm not sure what may be going wrong here. Any suggestions would be much appreciated! Thanks, Raph