Open bob-carpenter opened 7 years ago
I wrote some Python (from scratch; it's not PyStan) to do the conversion, mainly to automate simulation + sampling for the same model when using CmdStan. It could be a useful starting point, and though you wouldn't want me to port it to C++ myself, it only uses NumPy, so it should be quite portable.
Thanks, @maedoc.
Now that you mention it, RStan must have all the pieces of this implemented because of the way extract() and stan_rdump() work.
Sure, but it's all in R code.
Is this still relevant? Is a converter from the cmdstan CSV output to R dump still needed? My guess would be no.
Something like this is needed for restarts, but I think that'd require a new command.
a few use cases I've wanted this for
I usually end up with a mess of grep, cut, tr, nl in bash for what is a pretty simple job. Two main modes would be
It'd also be useful to massage CSV to convert matrices from x.1.2 style columns to 2D ascii matrices for use with GnuPlot or similar, but that's fairly outside scope.
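As a rough illustration of that reshaping, here is a minimal sketch that collects the `x.1.2` style columns of one draw into a 2D matrix and prints it as whitespace-separated rows. The function name `columns_to_matrix` and the sample header and row are made up for the example; it assumes the header and one row have already been split into lists of strings.

```python
import re

def columns_to_matrix(header, row, name):
    """Collect columns called name.i.j (1-based indices) into a list of rows."""
    pattern = re.compile(rf"{re.escape(name)}\.(\d+)\.(\d+)")
    cells = {}
    for col, token in zip(header, row):
        m = pattern.fullmatch(col)
        if m:
            cells[(int(m.group(1)), int(m.group(2)))] = float(token)
    if not cells:
        raise KeyError(f"no columns named {name}.i.j")
    n_rows = max(i for i, _ in cells)
    n_cols = max(j for _, j in cells)
    return [[cells[(i, j)] for j in range(1, n_cols + 1)]
            for i in range(1, n_rows + 1)]

# Hypothetical header and single draw; matrix columns appear in column-major order.
header = ["lp__", "x.1.1", "x.2.1", "x.1.2", "x.2.2"]
row = ["-7.3", "1", "3", "2", "4"]
matrix = columns_to_matrix(header, row, "x")

# One matrix row per output line, separated by spaces.
for r in matrix:
    print(" ".join(f"{v:g}" for v in r))
```

Looking cells up by their index pair, rather than relying on column order, keeps the sketch correct even if the columns are not stored in the order assumed here.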
Is input/output in JSON now part of CmdStan? That seems like the easiest way to go. I could give it a go, since it'd be miles better than the bash equivalent.
Once https://github.com/stan-dev/cmdstanr/pull/95 is merged into cmdstanr, you will be able to read the samples and all sampler parameters (divergent, leapfrog, etc.) with read_sample_csv(filenames) in R. It outputs the following list:
list(
sampling_info
inverse_mass_matrix
warmup
post_warmup
warmup_sampler
post_warmup_sampler
)
I think this is close to what you are looking for. You can read existing CmdStan CSV files; there is no need to run the model through cmdstanr if you don't want to.
If you feel more at home with Python then try check_sampler_csv from cmdstanpy. I think it does something similar.
Input in JSON has been a part of CmdStan for quite some time; we just made the input a bit faster for the last release. The output is still CSV only, however.
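Since the input side accepts JSON, one draw from the CSV output could in principle be turned into a JSON data file. The following is only a sketch under stated assumptions: the helper names `nest` and `draw_to_json` are hypothetical, indexed columns are assumed to appear in column-major order, columns ending in `__` are treated as sampler diagnostics and skipped, and every value is written as a real (integer handling is left out).

```python
import json

def nest(flat, dims):
    """Arrange values given in column-major order into nested row-major lists."""
    if len(dims) == 1:
        return list(flat)
    stride = dims[0]
    # Element i of the first dimension owns every stride-th value starting at i.
    return [nest(flat[i::stride], dims[1:]) for i in range(dims[0])]

def draw_to_json(header, row):
    """Build a JSON object mapping each variable name to its (nested) values."""
    groups = {}  # name -> list of (index list, value), in CSV order
    for col, token in zip(header, row):
        name, *idx = col.split(".")
        if name.endswith("__"):  # assumed to be lp__, accept_stat__, etc.
            continue
        groups.setdefault(name, []).append(([int(i) for i in idx], float(token)))
    data = {}
    for name, items in groups.items():
        values = [v for _, v in items]
        n_dims = len(items[0][0])
        dims = [max(ix[k] for ix, _ in items) for k in range(n_dims)]
        data[name] = nest(values, dims) if dims else values[0]
    return json.dumps(data)

# Hypothetical header and single draw.
header = ["lp__", "theta", "x.1.1", "x.2.1", "x.1.2", "x.2.2"]
row = ["-7.3", "0.25", "1", "3", "2", "4"]
print(draw_to_json(header, row))
```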
I'm aware of the R/Py interfaces to CmdStan as well, but was hoping to stick with a plain Bash/Makefile setup. I think for complex workflows that's just not realistic. Munging data formats on the command line is precarious, especially for matrix/array datatypes.
In case it wasn't clear to our devs not involved in CmdStanPy, the original version was derived from Marmaduke's PyCmdStan package.
Oh haha :) Now I feel like a fool :blush:
Don't feel bad---it's a big project with too much going on for any one person to follow. I'm just trying to close the loops where I see an opportunity.
Moved from https://github.com/stan-dev/stan/issues/544
In order to perform fake data simulation or posterior predictive checking, it would be nice to be able to convert the output of a Stan model from CSV format to the input for a Stan model in R dump format.
This should be structured as a command parallel to bin/print that does the conversion of an output CSV file. An alternative would be to have a model call argument that would produce R dump output.
The manual for CmdStan needs to be updated to show how to use this function. This will enable us to write a chapter in the manual on fake data and posterior predictive checks.
Be careful about the type of the columns --- if there are integer generated quantities, the output can be integers.
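To make the conversion and the integer caveat concrete, here is a minimal Python sketch, not the proposed CmdStan command. The names `read_draw` and `to_rdump` are hypothetical. It assumes comment lines start with `#`, that columns ending in `__` are sampler diagnostics to be skipped, and that indexed columns arrive in column-major order, which is also the order `structure(..., .Dim = ...)` expects. A variable is written as integers only when every one of its tokens looks like an integer; `nan` and `inf` values are not handled.

```python
import csv
import io
import re

def read_draw(text, draw=0):
    """Return (header, row) for one draw, skipping blank and '#' comment lines."""
    lines = [ln for ln in io.StringIO(text)
             if ln.strip() and not ln.startswith("#")]
    rows = list(csv.reader(lines))
    return rows[0], rows[1 + draw]

def to_rdump(header, row, skip_suffix="__"):
    """Format one draw as R dump text, keeping integer-valued variables integer."""
    groups = {}  # name -> list of (index tuple, token), in CSV order
    for col, token in zip(header, row):
        name, *idx = col.split(".")
        if name.endswith(skip_suffix):
            continue
        groups.setdefault(name, []).append((tuple(int(i) for i in idx), token))
    out = []
    for name, items in groups.items():
        tokens = [t for _, t in items]
        # Integer only if every token of the variable is a plain integer literal.
        is_int = all(re.fullmatch(r"[+-]?\d+", t) for t in tokens)
        vals = [str(int(t)) if is_int else repr(float(t)) for t in tokens]
        dims = [max(ix[k] for ix, _ in items) for k in range(len(items[0][0]))]
        if not dims:
            out.append(f"{name} <- {vals[0]}")
        elif len(dims) == 1:
            out.append(f"{name} <- c({', '.join(vals)})")
        else:
            dim_text = ", ".join(str(d) for d in dims)
            out.append(f"{name} <- structure(c({', '.join(vals)}), .Dim = c({dim_text}))")
    return "\n".join(out)

# Hypothetical CSV output with one real parameter and one integer generated quantity.
sample = "# comment line\nlp__,theta,y_rep.1,y_rep.2,y_rep.3\n-7.3,0.25,0,1,0\n"
header, row = read_draw(sample)
print(to_rdump(header, row))
```

Deciding the type per variable rather than per value avoids writing a real-valued vector that happens to contain a `3` as a mix of integers and reals. A real scalar printed as `3` would still be emitted as an integer by this heuristic.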
For example, for the Bernoulli model in the introduction, a fake-data generator should look like:
Related issues:
N
in order to provide input for this model