Open santikka opened 4 months ago
I can reproduce the issue - outputs from CmdStan 2.34.1 are read fine, outputs from 2.35.0 cause problems.
Although I get a different error message (maybe because of using R 4.3.2)
Error in rstan::read_stan_csv(samples$output_files()) :
object 'n_kept' not found
I'll also note that the problem is made worse by the code at:
https://github.com/stan-dev/rstan/blob/9b8d8fe715f1d51063621e6ca01d6265b02356fe/rstan/rstan/R/stan_csv.R#L220 where if save_warmup
is not all 0 or all 1, n_kept
never gets assigned.
Thanks. I think this might be fixed by https://github.com/stan-dev/rstan/pull/1131. Or at least the save_warmup
issue, not sure about the n_kept
issue @martinmodrak ran into.
@jgabry Yes, this doesn't resolve the problamatic logic around n_kept
. Also #1131 seems likely to break the old .CSVs, though. I didn't do a full checkout, but notice:
as.integer(as.logical("0"))
# [1] NA
as.integer(as.logical("false"))
# [1] 0
Also #1131 seems likely to break the old .CSVs, though.
Ah good point. Perhaps we should revert #1131 and figure out a more general approach? @bgoodri @andrjohns we probably need to decide whether we want to keep supporting this functionality in RStan. If there continue to be changes in the CSV files created by CmdStan I don't think RStan can keep up with releases in order to make sure read_stan_csv
keeps working. Do we even need it anymore given that CmdStanR exists?
The approach I took for cmdstanr
for compatibility with both was to convert any true
/false
to 0/1, so that the parsing logic for the rest stayed the same
Note that using cmdstanr with brms relies on read_stan_csv. We probably should add a test with fixed CSVs from s couple CmdStan versions and it should be OK....
Dne st 31. 7. 2024 19:37 uživatel Andrew Johnson @.***> napsal:
The approach I took for cmdstanr for compatibility with both was to convert any true/false to 0/1 https://github.com/stan-dev/cmdstanr/blob/0e3a99a52bc5ad15d9e386e9993c10ad6905818b/R/csv.R#L843, so that the parsing logic for the rest stayed the same
— Reply to this email directly, view it on GitHub https://github.com/stan-dev/rstan/issues/1133#issuecomment-2261026771, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACILKU3PJQLPI4IAEB7ONHDZPEOFFAVCNFSM6AAAAABKVJVGVGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENRRGAZDMNZXGE . You are receiving this because you were mentioned.Message ID: @.***>
Note that using cmdstanr with brms relies on read_stan_csv.
It doesn't look like that's the case anymore (which makes sense, otherwise there would have been a ton of bug reports in brms
once 2.35 was released)
otherwise there would have been a ton of bug reports in
brms
once 2.35 was released
Yeah that would have been a mess!
I was getting the if (max(save_warmup) == 0L)
error and installed the latest version of rstan (from r-universe) and I'm now getting object 'n_kept' not found
error.
Is there a magical version I can use where I won't get either of these errors? Or is there a way I need to modify the input data to make it work?
I got the code+data combination I'm attempting from someone else and it was working at one point. I'm not sure what package version were used then. :-/
Sorry for the hassle! I'm honestly not sure. Originally rstan::read_stan_csv
only existed because there was no such thing as cmdstanr. So users needed a way to import CmdStan output into R. But now that cmdstanr has been around for years, it hasn't been a priority to keep the CmdStan csv files (which change sometimes) backwards compatible with rstan::read_stan_csv
nor has there been enough bandwidth among developers to keep rstan::read_stan_csv
up to date with changes to CmdStan.
Is this for a model fit with cmdstanr? Here's some code from @andrjohns to produce a named list of arrays of posterior samples, similar to what rstan would do:
That will hopefully allow you to avoid rstan::read_stan_csv
entirely.
@bgoodri In the next RStan release read_stan_csv
should probably be formally deprecated unless the plan is to keep updating it when CmdStan csv files change, but I don't think that's been the plan so far.
@katrinabrock in addition to my previous comment, another option could be to run an older version of cmdstan. I think it might just be as of 2.35.0 that this is an issue, so you could try installing e.g. 2.34.0. cmdstanr::install_cmdstan()
will let you specify version = "2.34.0"
.
@jgabry
Thanks for the quick response! switching cmdstanr to version 2.34.0 did not appear to resolve the issue. I'm a bit puzzled because the code I'm working with does not appear to be using cmdstanr. It appears to be using rstan only and passing an r object in as data. I will dig a bit more an post back if I discover anything useful.
Yeah changing the cmdstanr version would only work if the csv files were being produced by cmdstanr. It’s weird if this is only using rstan because read_stan_csv() shouldn’t be necessary to use at all in that case. So yeah, quite strange!
ok! I did some more digging and discovered a couple mistakes I made in my initial debugging: 1) I initially thought the script I was working on was calling rstan::stan when it was in fact calling rethinking::stan (https://github.com/rmcelreath/rethinking/blob/master/R/cmdstan_support.r#L104). 2) I switched my cmdstan versions, but forgot that cmdstan version mismatch will not automatically trigger recompilation of the model. Therefore, I was still running my code with models produced with 2.35.0 so of course this did not do anything to resolve the issue.
With this in mind, here is my advice to others (@santikka ) to workaround this issue:
For those hitting this issue from using the rethinking
package (traceback includes this line https://github.com/rmcelreath/rethinking/blob/master/R/cmdstan_support.r#L97). This is happening because rethinking is trying to convert the cmdstanr
output into a rstan
output which is no longer supporter (per @jgabry above).
To resolve this, either
1) add rstan_out=FALSE
to your rethinking::stan
call which will result in the call returning a cmdstanr
output instead of attempting to convert
OR
switch to set_ulam_cmdstan(FALSE)
which will take cmdstanr
out of the picture, use pure rstan
and thus the result format will be unchanged
OR
use the general solution below
Use cmdstan 3.34.0
or earlier.
Steps:
Step 1) (one time per machine) Install earlier version of cmdstanr. cmdstanr::install_cmdstan(version = "2.34.0")
.
Step 2) (every time run your legacy code) Activate the older version of cmdstan like cmdstanr::set_cmdstan_path(file.path(dirname(cmdstanr::cmdstan_path()), 'cmdstan-2.34.0'))
. (Should work cross platform, but only tested on Linux.). You can check that this worked by running cmdstanr::cmdstan_version()
. Result should be "2.34.0"
. It would make sense to put this cmdstanr::set_cmdstan_path(file...
at the top of any script where you're having this issue.
Step 3) (one time per model) Recompile any models that you compiled with an incompatible version of cmdstanr
. There are a few ways you can do this:
my-model.stan
file, you will see another file called my-model
in the same directory. When you delete this and re-run your model, it will automatically be re-created with the version of cmdstan
that is currently activated.force_recompile = TRUE
to either your cmdstanr::cmdstan_model(...)
call or your $compile
call. options("cmdstanr_force_recompile" = TRUE)
before running your model. This works even if you are not calling cmdstanr
directly, but via some other library such as rethinking
.Note: With Option B or C, once you've done all this recompiling once for each model with version 2.34.0
, you probably don't need to recompile every single time you run your code. To go back to not recompiling by default, remove force_recompile = TRUE
from your arguments or run options("cmdstanr_force_recompile" = FALSE)
.
Re-write your code to use only cmdstanr since it's better supported than rstan and much better supported than a hybrid.
Thanks so much for the detailed report! Hopefully this is useful to anyone else running into the same problem.
Summary:
rstan::read_stan_csv()
no longer works for for .csv files produced bycmdstanr
Description:
It seems that
rstan
expects thesave_warmup
field to be an integer value, butcmdstanr
instead writes this as a logical value (e.g.,false
) to the conversion to integer fails inrstan:::parse_stancsv_comments()
resulting in anNA
value forsave_warmup
. This used to work with earlier versions ofcmdstanr
/CmdStan
.Reproducible Steps:
Created on 2024-07-10 with reprex v2.1.1
RStan Version:
2.32.6
R Version:
R version 4.4.0 (2024-04-24 ucrt)
Operating System:
Windows 11 Pro 23H2