stan-dev / cmdstan

CmdStan, the command line interface to Stan
https://mc-stan.org/users/interfaces/cmdstan
BSD 3-Clause "New" or "Revised" License

id for each chain should be unique in multi chain output csvs #1257

Open SteveBronder opened 5 months ago

SteveBronder commented 5 months ago

Summary:

Right now the id reported in the output CSV is the same for every chain, but it should be different in each output file.

Reproducible Steps:

make examples/bernoulli/bernoulli
./examples/bernoulli/bernoulli sample num_chains=2 data file="./examples/bernoulli/bernoulli.data.R"

Current Output:

For model_2.csv as an example

# stan_version_major = 2
# stan_version_minor = 34
# stan_version_patch = 1
# model = bernoulli_model
# start_datetime = 2024-03-14 19:00:54 UTC
# method = sample (Default)
#   sample
#     num_samples = 1000 (Default)
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.05 (Default)
#       delta = 0.8 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#       save_metric = 0 (Default)
#     algorithm = hmc (Default)
#       hmc
#         engine = nuts (Default)
#           nuts
#             max_depth = 10 (Default)
#         metric = diag_e (Default)
#         metric_file =  (Default)
#         stepsize = 1 (Default)
#         stepsize_jitter = 0 (Default)
#     num_chains = 2
# id = 1 (Default)
# data

Expected Output:

For model_2.csv

# stan_version_major = 2
# stan_version_minor = 34
# stan_version_patch = 1
# model = bernoulli_model
# start_datetime = 2024-03-14 19:00:54 UTC
# method = sample (Default)
#   sample
#     num_samples = 1000 (Default)
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.05 (Default)
#       delta = 0.8 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#       save_metric = 0 (Default)
#     algorithm = hmc (Default)
#       hmc
#         engine = nuts (Default)
#           nuts
#             max_depth = 10 (Default)
#         metric = diag_e (Default)
#         metric_file =  (Default)
#         stepsize = 1 (Default)
#         stepsize_jitter = 0 (Default)
#     num_chains = 2
# id = 2
# data
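
The difference between the current and expected output can be checked mechanically. The following is a minimal sketch, not part of CmdStan: `read_chain_id` and `ids_are_unique` are hypothetical helper names, and the sketch assumes the `# id = N` line always appears in the leading comment block, as it does in the headers shown above.

```python
import re
from typing import Iterable, Optional

# Matches header lines such as "# id = 1 (Default)" or "# id = 2".
_ID_LINE = re.compile(r"^#\s*id\s*=\s*(\d+)")


def read_chain_id(lines: Iterable[str]) -> Optional[int]:
    """Return the id reported in the leading comment block, or None."""
    for line in lines:
        if not line.startswith("#"):
            break  # the leading comment block ends at the first non-comment line
        match = _ID_LINE.match(line)
        if match:
            return int(match.group(1))
    return None


def ids_are_unique(headers: Iterable[Iterable[str]]) -> bool:
    """True when every header reports an id and no two headers share one."""
    ids = [read_chain_id(header) for header in headers]
    return None not in ids and len(set(ids)) == len(ids)


# Abbreviated headers modeled on the outputs in this issue.
current = ["#     num_chains = 2", "# id = 1 (Default)", "# data"]
expected = ["#     num_chains = 2", "# id = 2", "# data"]

print(ids_are_unique([current, current]))   # both files report id = 1
print(ids_are_unique([current, expected]))  # one id per chain
```

To run the same check against real output, pass an open file handle for each CSV in place of the lists of strings.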

Additional Information:

I swear I did this in the past, but I think what we need to do is pass an iterator for the chain number to parser.print, so that when that function prints the id we can override the value with the chain id from within the Stan program.
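
The idea can be sketched outside of CmdStan's C++ code. The following is an illustration only, under stated assumptions: `format_header` is a hypothetical stand-in for the header-printing step, not the real `parser.print` signature, and the sketch assumes that chain `i` (counting from zero) runs with id `first_id + i`.

```python
from typing import List, Optional, Tuple

# (name, requested value, default value); a simplified stand-in for the
# argument tree that CmdStan prints into each CSV header.
Arg = Tuple[str, int, int]


def format_header(args: List[Arg], chain_id: Optional[int] = None) -> List[str]:
    """Format header lines, optionally overriding the reported id."""
    lines = []
    for name, value, default in args:
        if name == "id" and chain_id is not None:
            value = chain_id  # report the id this chain actually ran with
        suffix = " (Default)" if value == default else ""
        lines.append(f"# {name} = {value}{suffix}")
    return lines


requested = [("num_chains", 2, 1), ("id", 1, 1)]
first_id = 1  # assumption: chain i runs with id first_id + i

headers = [format_header(requested, chain_id=first_id + i) for i in range(2)]
for header in headers:
    print("\n".join(header))
```

Without the `chain_id` argument the function echoes the requested value, which is the current behavior. With it, the second chain's header reports `# id = 2` and drops the `(Default)` marker, matching the expected output above.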

Current Version:

v2.34.1

WardBrian commented 5 months ago

This is really another variant of the bug in https://github.com/stan-dev/cmdstan/issues/1029, where running the fixed_param sampler on a model that lacks parameters still reports HMC in the output. The output here is always the command line as requested, not the actual run information.

WardBrian commented 3 months ago

This same behavior also led to some confusion on the forums, because the inits argument is reported as the same in each file: https://discourse.mc-stan.org/t/cmdstanpy-supplying-multiple-paths-as-inits-writes-to-tmp/34968