StanJulia / StanSample.jl

WIP: Wrapper package for the sample method in Stan's cmdstan executable.
MIT License
18 stars 4 forks source link

`read_samples` only reads chain 1 with `use_cpp_chains=true` #65

Closed apintar closed 1 year ago

apintar commented 1 year ago

Here is a minimal working example:

using StanSample
using MCMCChains

set_cmdstan_home!("/your/path/to/cmdstan")

mwe_model = "
parameters {
    real y;
}
model {
    y ~ normal(0.0, 1.0);
}
"

sm_mwe= SampleModel("mwe_model", mwe_model, joinpath(pwd(), "tmp"))

rc_mwe = stan_sample(sm_mwe; num_chains=5, use_cpp_chains=true)

if success(rc_mwe)
    post_samps_mwe = read_samples(sm_mwe, :mcmcchains)
end

post_samps_mwe.value[1:5, 1, :]

You should notice that all 5 chains in post_samps_mwe are identical, and if you examine the CSV files output by the call to stan_sample you should see that chain 1 was read in 5 times.

I believe the issue is with the function read_csv_files in the source file read_csv_files.jl.

In that function, I tried changing the line

csvfile = output_base*name_base*"_$(i).csv"

to

csvfile = output_base*name_base*"_$(k).csv"

It fixes the problem when use_cpp_chains=true, but then the problem appears for use_cpp_chains=false

goedman commented 1 year ago

Thank you! I'll take a look.

goedman commented 1 year ago

Hi @apintar Thanks again for spotting this. I'll chase back in time when/why above line stopped working correctly. It must have worked when I did the latest performance comparisons with RedCardsStudy about a year ago. I would be very interested to know if you find improvements in speed between using CPP or Julia chains. But you might hav other reasons to look into using CPP.

I think above line should be csvfile = output_base*name_base*"_$(i + k - 1).csv". StanSample v6.13.4 includes this fix and was just merged. Might take a bit (maybe 30 mins, but sometimes much quicker or a bit longer) before visible on all servers. If you have StanSample dev-ed you should be able to try it using the latest master. I've added tests in the test_apinter subdirectory of test

If this works fine, I'll add an item to the versions list in README for your contribution. Thanks again.

apintar commented 1 year ago

@goedman Thank you! Seems to be working as expected now.

I would be very interested to know if you find improvements in speed between using CPP or Julia chains. But you might hav other reasons to look into using CPP.

I haven't done much testing. I'm used to using Stan/CPP chains with R, Python, and CmdStan since it avoids the overhead of starting a separate process and copying data. That advantage may be lost in Julia because of Julia's threading capabilities.

goedman commented 1 year ago

Thanks @apintar

Great, I'll leave this issue open for a few more days to see if anything pops up.

At some point I would like to simplify this chain creation/handling part.