Open jburos opened 6 years ago
Thanks for this! There are a couple of different functions provided by rstan for extracting draws from a stanfit object, and it looks like for stanfit objects created by `read_stan_csv`, some functions will give names with `[]` and some with `.`. It should be easy enough for me to switch tidybayes to use one of the functions that gives the names with `[]`, which should fix `spread_draws` and other functions relying on that syntax.
It's been a while since this issue was addressed, but posting my work-around here in case it's helpful to others.
My process now is to read the CSV files with `readr::read_csv`, then convert the draws with `coda::as.mcmc()` or `coda::as.mcmc.list()`.
Here is my current `.repair_names` function, in case it helps anyone:
library(magrittr)  # provides the %>% pipe used below

#' Change incoming format (`parameter.i.j`) to standard format (`parameter[i,j]`)
#' @param .names character or list of character names
#' @return character names with 1-, 2-, and 3-D structures renamed to v[i], v[i,j], and v[i,j,k]
.repair_names <- function(.names) {
  .names %>%
    # reformat 1-D parameters -> parameter[i]
    purrr::map_if(.p = ~stringr::str_detect(.x, pattern = '^[^.]+\\.?\\d+$'),
                  .f = ~stringr::str_replace(.x, pattern = '\\.(\\d+)$', replacement = '[\\1]')) %>%
    # reformat 2-D parameters -> parameter[i,j]
    purrr::map_if(.p = ~stringr::str_detect(.x, pattern = '^[^.]+\\.\\d+\\.\\d+$'),
                  .f = ~stringr::str_replace(.x, pattern = '\\.(\\d+)\\.(\\d+)$', replacement = '[\\1,\\2]')) %>%
    # reformat 3-D parameters -> parameter[i,j,k]
    purrr::map_if(.p = ~stringr::str_detect(.x, pattern = '^[^.]+\\.\\d+\\.\\d+\\.\\d+$'),
                  .f = ~stringr::str_replace(.x, pattern = '\\.(\\d+)\\.(\\d+)\\.(\\d+)$', replacement = '[\\1,\\2,\\3]')) %>%
    # map_if() returns a list; flatten it back to a character vector
    as.character()
}
I'm pretty sure there is a slightly more elegant regex one could use, but I decided to forgo that complexity for now.
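One possible shape for that more compact regex is sketched below, using base R only. The function name `repair_names_one_pass` is made up for illustration and is not part of tidybayes or rstan. It assumes every trailing run of `.digits` groups is an index, so unlike `.repair_names` it also rewrites names whose stem itself contains a dot, and it is not limited to three dimensions.

```r
# Hypothetical single-pass alternative to .repair_names(); base R only.
repair_names_one_pass <- function(.names) {
  .names <- as.character(unlist(.names))
  # Locate the trailing run of ".<digits>" groups, e.g. ".2.3" in "Sigma.2.3".
  m <- regexpr("(\\.\\d+)+$", .names, perl = TRUE)
  has_index <- m > 0
  if (any(has_index)) {
    suffix <- regmatches(.names, m)  # matches only, in order
    stem <- substr(.names[has_index], 1, m[has_index] - 1)
    # ".2.3" -> "2.3" -> "2,3"
    index <- gsub(".", ",", substring(suffix, 2), fixed = TRUE)
    .names[has_index] <- paste0(stem, "[", index, "]")
  }
  .names
}

repair_names_one_pass(c("lp__", "alpha", "beta.1", "Sigma.2.3", "theta.1.2.3"))
# "lp__" "alpha" "beta[1]" "Sigma[2,3]" "theta[1,2,3]"
```

Names without a trailing numeric index, such as `lp__`, pass through unchanged.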
Thanks, this is very helpful! Sorry I haven't gotten back to this one yet.
At some point it might be good for spread/gather_draws to support more custom column names inputs so you don't have to do this manually. A simple change that might fix this is an analog to the `sep` argument... currently setting `sep = "[.]"` in `spread_draws()` would get you halfway there, maybe a solution would be to add similar `open_bracket` and `close_bracket` arguments, or a more generic way to build up these expressions using regexes.
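To make the idea concrete, here is a rough sketch of how such arguments could feed into the matching regex. This is not tidybayes code: `build_variable_regex` is a hypothetical helper, and `open_bracket` / `close_bracket` are the proposed (not existing) arguments, treated here as regex fragments in the same way `sep` is.

```r
# Hypothetical helper: assemble the variable-matching regex from
# user-supplied bracket patterns (both given as regex fragments).
build_variable_regex <- function(variable,
                                 open_bracket = "\\[",
                                 close_bracket = "\\]") {
  paste0("^(", variable, ")", open_bracket, "(.+)", close_bracket, "$")
}

# Default: standard rstan-style names such as "e_beta[1]".
bracket_rx <- build_variable_regex("e_beta")

# Dot-delimited names such as "e_beta.1": a literal dot opens the index
# and nothing closes it.
dot_rx <- build_variable_regex("e_beta", open_bracket = "[.]", close_bracket = "")

grepl(bracket_rx, "e_beta[1]")  # TRUE
grepl(dot_rx, "e_beta.1")       # TRUE

# The captured index string can then be split on `sep`, as it is today.
strsplit(sub(dot_rx, "\\2", "e_beta.1.2"), "[.]")[[1]]  # "1" "2"
```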
Here's a simpler workaround. I don't know whether it works in all cases, but in my case this was sufficient to solve the problem.
Assuming "stanfit" is a stanfit object created by rstan::read_stan_csv from a cmdstan csv-file:
# For each chain, rename the variables to restore tidybayes compatibility.
for (currentChainId in seq_along(stanfit@sim$samples)) {
  names(stanfit@sim$samples[[currentChainId]]) <- names(stanfit)
}
First, thanks for putting together a well-thought-out package. Very useful functions here to support common workflows.
I'm seeing a problem using `spread_draws` with a fit loaded from CmdStan using `rstan::read_stan_csv()`. Specifically, in the context of my code:
And, when using a fit called `m2` similar to that in your ABC_fit vignette (see below for reproducible example):
What I think is going on
It looks like the stanfit object is formatted slightly differently when it is fit using CmdStan + `rstan::read_stan_csv` than when it is fit using `rstan::sampling`. This may be an rstan bug rather than a tidybayes bug, but it impacts tidybayes more than rstan.
In any case, this causes the data.frame returned by `tidy_draws()` to name the variables like `name_of_param.index` instead of `name_of_param[index]`. Since `spread_draws` is looking for variables using a regex like `"^(e_beta)\\[(.+)\\]$"`, none of these variables is found by `spread_draws_long_`. For example:
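A minimal base-R illustration of that mismatch, with made-up column names standing in for the `tidy_draws()` output (the names are assumptions; only the regex is taken from above):

```r
# The pattern spread_draws builds for a variable called e_beta.
pattern <- "^(e_beta)\\[(.+)\\]$"

# Made-up column names in the two formats described above.
cmdstan_names <- c("e_beta.1", "e_beta.2")    # via CmdStan + read_stan_csv
rstan_names   <- c("e_beta[1]", "e_beta[2]")  # via rstan::sampling

grepl(pattern, cmdstan_names)  # FALSE FALSE: nothing for spread_draws to find
grepl(pattern, rstan_names)    # TRUE TRUE
```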
Reproducible example
Here is a reproducible example based on your vignette (see full code & sampled chains in this gist).
Apologies, the output is somewhat verbose.
In the first part I set up the environment & fit the ABC model as described in your vignette "Using tidy data with Bayesian Models".
None of the above is that surprising -- everything works as expected.
Next we try the same fit, except using CmdStan to draw the samples.
SessionInfo
And I'm using cmdstan at stan-dev/cmdstan@6bdc8ba