stan-dev / cmdstanpy

CmdStanPy is a lightweight interface to Stan for Python users which provides the necessary objects and functions to compile a Stan program and fit the model to data using CmdStan.
BSD 3-Clause "New" or "Revised" License

fit.summary() doesn't work with percentiles #760

Closed bob-carpenter closed 1 month ago

bob-carpenter commented 1 month ago

Summary:

fit.summary() fails if I specify a percentiles argument in the same way as suggested in the documentation.

Specifically, this fails:

fit.summary(percentiles=(0.025, 0.975))

Without the percentiles argument it works fine.

Also, if you're going in and modifying this, it'd be nice if it worked with anything array-like. I get a calling error if I pass in [0.024, 0.975].
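Until array-like input is supported, one possible stopgap is to normalize the argument on the caller's side. The sketch below uses a hypothetical helper, `as_percentile_tuple`, which is not part of CmdStanPy; it only assumes that a plain tuple of numbers is accepted, as the tuple-based calls later in this thread suggest.

```python
from collections.abc import Iterable
from numbers import Real


def as_percentile_tuple(values):
    """Coerce a scalar or any iterable of numbers into a tuple of floats.

    Hypothetical helper for illustration; not part of CmdStanPy.
    """
    if isinstance(values, Real):
        # A single number becomes a one-element tuple.
        return (float(values),)
    if isinstance(values, (str, bytes)) or not isinstance(values, Iterable):
        raise TypeError(f"expected numbers, got {type(values).__name__}")
    # Lists, tuples, ranges, and 1-D NumPy arrays are all iterable.
    return tuple(float(v) for v in values)


print(as_percentile_tuple([2.5, 97.5]))
```

With a fitted model in hand, the call would then look like `fit.summary(percentiles=as_percentile_tuple([2.5, 97.5]))`.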

Description:

CmdStan seems to be failing with the translated call.

Traceback (most recent call last):
  File "/Users/bcarpenter/temp2/nuts-funnel/sim.py", line 13, in <module>
    print(fit.summary(percentiles=(0.025, 0.975)))
  File "/opt/homebrew/lib/python3.10/site-packages/cmdstanpy/stanfit/mcmc.py", line 531, in summary
    do_command(cmd, fd_out=None)
  File "/opt/homebrew/lib/python3.10/site-packages/cmdstanpy/utils/command.py", line 76, in do_command
    raise RuntimeError(msg)
RuntimeError: Command ['/Users/bcarpenter/.cmdstan/cmdstan-2.34.1/bin/stansummary', '--percentiles= 0.025,0.975', '--sig_figs=6', '--csv_filename=/var/folders/z8/stnk9f1n1_x11wwz_twcqlbw0000gq/T/tmp0jho8bar/stansummary-funnel-akj76g0s.csv', '/var/folders/z8/stnk9f1n1_x11wwz_twcqlbw0000gq/T/tmp0jho8bar/funneleoo1220i/funnel-20240604004339.csv']
    error during processing Operation not permitted


Additional Information:

Here's my full Python program:

```python
import logging

import cmdstanpy as csp
import numpy as np  # required by np.zeros, np.ones, np.sqrt, np.arange below

# Silence CmdStanPy's informational logging.
csp.utils.get_logger().setLevel(logging.ERROR)

model = csp.CmdStanModel(stan_file='funnel.stan')
init = {'double_log_scale': 0, 'alpha': np.zeros(9)}
mass_matrix = {'inv_metric': np.ones(10)}
N = 2
# Sample with a geometric sequence of fixed step sizes, adaptation turned off.
for epsilon in 0.001 * np.sqrt(2)**np.arange(10):
    print(f"\n\n epsilon={epsilon:6.4f}")
    fit = model.sample(
        inits=init, chains=1, step_size=epsilon, iter_warmup=0,
        adapt_engaged=False, iter_sampling=10_000, metric=mass_matrix,
        show_progress=False,
    )
    print(fit.summary(percentiles=(0.025, 0.975)))  # this is the call that fails
```

And here's the Stan model:

parameters {
  real double_log_scale;
  vector[9] alpha;
}
model {
  double_log_scale ~ normal(0, 3);
  alpha ~ normal(0, exp(double_log_scale / 2));
}

Current Version:

>>> csp.show_versions()
INSTALLED VERSIONS
---------------------
python: 3.10.8 (main, Oct 13 2022, 09:48:40) [Clang 14.0.0 (clang-1400.0.29.102)]
python-bits: 64
OS: Darwin
OS-release: 23.5.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
cmdstan_folder: /Users/bcarpenter/.cmdstan/cmdstan-2.34.1
cmdstan: (2, 34)
cmdstanpy: 1.2.2
pandas: 2.0.1
xarray: None
tqdm: 4.65.0
numpy: 1.23.4

WardBrian commented 1 month ago

`percentiles` must be a list of integers between 1 and 99, not floats as you've supplied here. I believe this is enforced by CmdStan upstream, not by us, though I would expect a better error.

bob-carpenter commented 1 month ago

This is actually weirder than that. For a start, apparently they're expecting percentiles rather than quantiles, which seems like a bad decision.

>>> print(fit.summary(percentiles=(2.5, 98)))
                      Mean      MCSE    StdDev      2.5%       98%    N_Eff  N_Eff/s     R_hat
lp__             -6.222420  1.960370  10.23330 -21.80770  14.70670  27.2491  2.31710  1.027320

What I originally tried works (though it's not what I wanted):

>>> print(fit.summary(percentiles=(.25, .97)))
                      Mean      MCSE    StdDev     0.25%     0.97%    N_Eff  N_Eff/s     R_hat
lp__             -6.222420  1.960370  10.23330 -23.88160 -22.67830  27.2491  2.31710  1.027320

What doesn't work is writing a real number the way I was expecting.

>>> print(fit.summary(percentiles=(0.025, 0.975)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/lib/python3.10/site-packages/cmdstanpy/stanfit/mcmc.py", line 531, in summary
    do_command(cmd, fd_out=None)
  File "/opt/homebrew/lib/python3.10/site-packages/cmdstanpy/utils/command.py", line 76, in do_command
    raise RuntimeError(msg)
RuntimeError: Command ['/Users/bcarpenter/.cmdstan/cmdstan-2.34.1/bin/stansummary', '--percentiles= 0.025,0.975', '--sig_figs=6', '--csv_filename=/var/folders/z8/stnk9f1n1_x11wwz_twcqlbw0000gq/T/tmpjcyx3buv/stansummary-funnel-8584o2wb.csv', '/var/folders/z8/stnk9f1n1_x11wwz_twcqlbw0000gq/T/tmpjcyx3buv/funnel1pcb56n0/funnel-20240604110246.csv']
    error during processing Operation not permitted

Definitely a bug in CmdStan. I'm not going to report there.
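For readers who think in quantiles, a small conversion step avoids the mismatch described above. This is a minimal sketch with a hypothetical helper, `quantiles_to_percentiles`, which is not part of CmdStanPy or CmdStan; the rounding precision is an arbitrary choice made for the example.

```python
def quantiles_to_percentiles(quantiles, ndigits=6):
    """Convert quantiles in [0, 1] to percentiles in [0, 100].

    Hypothetical helper for illustration; not part of CmdStanPy.
    """
    result = []
    for q in quantiles:
        q = float(q)
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"quantile {q} is outside [0, 1]")
        # Round to suppress binary floating-point noise in the product.
        result.append(round(q * 100.0, ndigits))
    return tuple(result)


print(quantiles_to_percentiles((0.025, 0.975)))
```

The result can be passed as the `percentiles` argument, so the 95% central interval would be requested as `fit.summary(percentiles=quantiles_to_percentiles((0.025, 0.975)))`.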

WardBrian commented 1 month ago

When I try those percentiles I get the message `Option --percentiles 0.025,0.975: values must be in range (0.1,99.9), inclusive, and strictly increasing.`
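The rule quoted in that message can be checked in Python before anything is handed to `stansummary`, which would give a clearer error than `Operation not permitted`. The sketch below is an assumption-laden illustration: `validate_percentiles` is a hypothetical helper, and the bounds 0.1 and 99.9 are taken from the message above rather than from the CmdStan source, so they may differ between CmdStan versions.

```python
def validate_percentiles(values, low=0.1, high=99.9):
    """Check that values lie in [low, high] and are strictly increasing.

    Hypothetical helper; the default bounds mirror the error message
    quoted in this thread and are an assumption, not a documented API.
    """
    checked = tuple(float(v) for v in values)
    if not checked:
        raise ValueError("at least one percentile is required")
    for p in checked:
        # NaN fails this comparison as well, so it is rejected here.
        if not low <= p <= high:
            raise ValueError(
                f"percentile {p} is outside [{low}, {high}]; "
                "pass percentiles such as 2.5, not quantiles such as 0.025"
            )
    for earlier, later in zip(checked, checked[1:]):
        if not earlier < later:
            raise ValueError("percentiles must be strictly increasing")
    return checked


print(validate_percentiles([2.5, 97.5]))
```

Under these bounds, `(0.25, 0.97)` passes and `(0.025, 0.975)` is rejected, which matches the behavior reported earlier in the thread.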

bob-carpenter commented 1 month ago

probably this: cmdstan: (2, 34); cmdstanpy: 1.2.2