Closed mitzimorris closed 1 month ago
@SteveBronder - I would really appreciate your eyeballs on the ESS calculations - a preliminary review?
@WardBrian - does this API look OK?
also @avehtari - I implemented tail ESS - current logic is that if tail ESS returns NaN, set it to bulk ESS, as this seems to be what the posterior package does. is this correct?
the current set of unit tests test split R-hat and split-ESS against what's in posterior for a run of 2 chains on the eight_schools model. suggestions for further tests welcome.
I think the new APIs look nice. I still have some concerns about the implementations, especially things like sample means vs population means, and how difficult it is to tell them apart just reading the code. For example, split_rank_normalized_rhat.hpp has
chain_mean(i) = chain_col.mean();
and
double mean_var = math::mean(chain_var);
just a few lines apart -- are these the same mean as each other? I think in this case they are, but it's worth making obvious. Same for variances. A few small named functions for these would go a long way, I think.
> also @avehtari - I implemented tail ESS - current logic is that if tail ESS returns NaN, set it to bulk ESS, as this seems to be what the posterior package does. is this correct?
In posterior package if tail ESS is NaN, then it's NaN. Can you provide a link to the posterior package code which made you think otherwise?
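For context, here is a small sketch of NaN propagation when taking the smaller of two tail estimates. It assumes tail ESS is the minimum of the ESS at the 5% and 95% quantiles, which is our understanding of posterior's `ess_tail`; the helper name `nan_propagating_min` is hypothetical and is not taken from the PR.

```cpp
#include <cmath>
#include <limits>

// Minimum of two ESS estimates that returns NaN if either input is NaN,
// matching the default behaviour of R's min() on NaN inputs.
inline double nan_propagating_min(double a, double b) {
  if (std::isnan(a) || std::isnan(b)) {
    return std::numeric_limits<double>::quiet_NaN();
  }
  return a < b ? a : b;
}
```

An explicit check is worth having in C++ because `std::min` is order-dependent with NaN: every comparison against NaN is false, so `std::min(NaN, 486.0)` yields NaN while `std::min(486.0, NaN)` yields 486.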
> In posterior package if tail ESS is NaN, then it's NaN. Can you provide a link to the posterior package code which made you think otherwise?
you're correct - this is what posterior will do. I'm puzzled as to why the implementation in this PR fails on a test of the eight schools model and data, run with 2 chains of 1000 iterations each, thinned by a factor of 2. the parameter tau has a low bulk ESS, and CmdStanR reports bulk and tail ESS having the same value. for the implementation in this PR, the tail 05 ESS is NaN and the tail 95 ESS is 486. In CmdStanR, the tail ESS is 71 - same as the bulk ESS. more testing needed.
update: caused by differences in computing quantiles between Eigen and R.
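To illustrate how two quantile rules can disagree on the same draws, here is a minimal sketch. `quantile_type7` follows R's default rule (type 7, linear interpolation at position (n - 1) * p); `quantile_lower` is a hypothetical non-interpolating alternative chosen only for contrast, and is not claimed to be what the PR or any library actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// R's default quantile rule (type 7): linear interpolation between the two
// order statistics that bracket position (n - 1) * p in the sorted draws.
inline double quantile_type7(std::vector<double> x, double p) {
  assert(!x.empty() && p >= 0.0 && p <= 1.0);
  std::sort(x.begin(), x.end());
  const double h = (static_cast<double>(x.size()) - 1.0) * p;
  const std::size_t lo = static_cast<std::size_t>(std::floor(h));
  const std::size_t hi = std::min<std::size_t>(lo + 1, x.size() - 1);
  return x[lo] + (h - static_cast<double>(lo)) * (x[hi] - x[lo]);
}

// Hypothetical alternative: return the lower bracketing order statistic
// without interpolating.
inline double quantile_lower(std::vector<double> x, double p) {
  assert(!x.empty() && p >= 0.0 && p <= 1.0);
  std::sort(x.begin(), x.end());
  const double h = (static_cast<double>(x.size()) - 1.0) * p;
  return x[static_cast<std::size_t>(std::floor(h))];
}
```

For the draws {1, 2, 3, 4} and p = 0.05, the interpolating rule gives 1.15 and the lower-order-statistic rule gives 1. Tail ESS is built from indicators of the form "draw <= quantile", so a shifted cut point changes which draws count as being in the tail, which plausibly matters most for short, thinned chains.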
@SteveBronder and @WardBrian - ready for first round of review
Name | Old Result | New Result | Ratio | Performance change( 1 - new / old ) |
---|---|---|---|---|
arma/arma.stan | 0.42 | 0.35 | 1.2 | 16.64% faster |
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 0.92 | -9.29% slower |
gp_regr/gen_gp_data.stan | 0.03 | 0.03 | 1.19 | 16.31% faster |
gp_regr/gp_regr.stan | 0.11 | 0.11 | 1.04 | 3.7% faster |
sir/sir.stan | 74.54 | 65.99 | 1.13 | 11.47% faster |
irt_2pl/irt_2pl.stan | 4.47 | 3.97 | 1.13 | 11.28% faster |
eight_schools/eight_schools.stan | 0.05 | 0.05 | 0.98 | -2.1% slower |
pkpd/sim_one_comp_mm_elim_abs.stan | 0.28 | 0.25 | 1.12 | 10.53% faster |
pkpd/one_comp_mm_elim_abs.stan | 20.98 | 18.67 | 1.12 | 11.05% faster |
garch/garch.stan | 0.49 | 0.41 | 1.19 | 15.76% faster |
low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.96 | 2.62 | 1.13 | 11.36% faster |
arK/arK.stan | 1.89 | 1.74 | 1.09 | 7.84% faster |
gp_pois_regr/gp_pois_regr.stan | 3.07 | 2.76 | 1.11 | 10.24% faster |
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 9.52 | 8.51 | 1.12 | 10.67% faster |
performance.compilation | 212.3 | 210.73 | 1.01 | 0.74% faster |
Mean result: 1.0978070902051777
Jenkins Console Log Blue Ocean Commit hash: 2cd95be39b0482d0fedb7fe9a1e8c77b7084592a
Name | Old Result | New Result | Ratio | Performance change( 1 - new / old ) |
---|---|---|---|---|
arma/arma.stan | 0.36 | 0.32 | 1.11 | 9.72% faster |
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 0.84 | -18.91% slower |
gp_regr/gen_gp_data.stan | 0.02 | 0.02 | 1.06 | 5.72% faster |
gp_regr/gp_regr.stan | 0.1 | 0.1 | 0.97 | -2.68% slower |
sir/sir.stan | 68.89 | 73.54 | 0.94 | -6.74% slower |
irt_2pl/irt_2pl.stan | 4.24 | 4.2 | 1.01 | 0.92% faster |
eight_schools/eight_schools.stan | 0.06 | 0.05 | 1.05 | 4.7% faster |
pkpd/sim_one_comp_mm_elim_abs.stan | 0.25 | 0.23 | 1.05 | 5.02% faster |
pkpd/one_comp_mm_elim_abs.stan | 19.58 | 18.62 | 1.05 | 4.9% faster |
garch/garch.stan | 0.43 | 0.42 | 1.02 | 2.27% faster |
low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.74 | 2.61 | 1.05 | 4.85% faster |
arK/arK.stan | 1.79 | 1.7 | 1.05 | 5.12% faster |
gp_pois_regr/gp_pois_regr.stan | 2.91 | 2.87 | 1.01 | 1.26% faster |
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.91 | 8.39 | 1.06 | 5.8% faster |
performance.compilation | 182.9 | 179.17 | 1.02 | 2.04% faster |
Mean result: 1.0204192093196358
Jenkins Console Log Blue Ocean Commit hash: 2cd95be39b0482d0fedb7fe9a1e8c77b7084592a
Name | Old Result | New Result | Ratio | Performance change( 1 - new / old ) |
---|---|---|---|---|
arma/arma.stan | 0.46 | 0.32 | 1.43 | 30.03% faster |
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 0.92 | -8.3% slower |
gp_regr/gen_gp_data.stan | 0.03 | 0.02 | 1.4 | 28.52% faster |
gp_regr/gp_regr.stan | 0.1 | 0.1 | 1.06 | 5.91% faster |
sir/sir.stan | 69.47 | 70.42 | 0.99 | -1.36% slower |
irt_2pl/irt_2pl.stan | 4.25 | 4.09 | 1.04 | 3.8% faster |
eight_schools/eight_schools.stan | 0.06 | 0.05 | 1.13 | 11.33% faster |
pkpd/sim_one_comp_mm_elim_abs.stan | 0.25 | 0.23 | 1.05 | 4.35% faster |
pkpd/one_comp_mm_elim_abs.stan | 19.49 | 18.64 | 1.05 | 4.38% faster |
garch/garch.stan | 0.44 | 0.41 | 1.07 | 6.73% faster |
low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.7 | 2.59 | 1.04 | 4.14% faster |
arK/arK.stan | 1.78 | 1.72 | 1.03 | 3.24% faster |
gp_pois_regr/gp_pois_regr.stan | 2.88 | 2.72 | 1.06 | 5.77% faster |
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.74 | 8.42 | 1.04 | 3.71% faster |
performance.compilation | 179.68 | 182.68 | 0.98 | -1.67% slower |
Mean result: 1.0860988235048856
Jenkins Console Log Blue Ocean Commit hash: 2cd95be39b0482d0fedb7fe9a1e8c77b7084592a
Name | Old Result | New Result | Ratio | Performance change( 1 - new / old ) |
---|---|---|---|---|
arma/arma.stan | 0.43 | 0.32 | 1.34 | 25.39% faster |
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 0.92 | -9.23% slower |
gp_regr/gen_gp_data.stan | 0.02 | 0.02 | 1.09 | 7.89% faster |
gp_regr/gp_regr.stan | 0.1 | 0.11 | 0.92 | -8.19% slower |
sir/sir.stan | 69.1 | 70.56 | 0.98 | -2.12% slower |
irt_2pl/irt_2pl.stan | 4.31 | 4.02 | 1.07 | 6.75% faster |
eight_schools/eight_schools.stan | 0.06 | 0.08 | 0.75 | -33.51% slower |
pkpd/sim_one_comp_mm_elim_abs.stan | 0.26 | 0.24 | 1.08 | 7.44% faster |
pkpd/one_comp_mm_elim_abs.stan | 19.84 | 18.65 | 1.06 | 5.99% faster |
garch/garch.stan | 0.44 | 0.41 | 1.08 | 7.62% faster |
low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.76 | 2.61 | 1.06 | 5.33% faster |
arK/arK.stan | 1.78 | 1.72 | 1.04 | 3.39% faster |
gp_pois_regr/gp_pois_regr.stan | 3.08 | 2.8 | 1.1 | 9.34% faster |
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.94 | 8.4 | 1.06 | 5.98% faster |
performance.compilation | 180.35 | 183.72 | 0.98 | -1.86% slower |
Mean result: 1.0355232469724303
Jenkins Console Log Blue Ocean Commit hash: 2cd95be39b0482d0fedb7fe9a1e8c77b7084592a
@WardBrian or @dylex - upstream tests are failing - is there a Jenkins problem? workspace?
I’m not seeing anything abnormal - latest run failed due to excess compiler warnings
> I’m not seeing anything abnormal - latest run failed due to excess compiler warnings
this is what I'm seeing:
continuous-integration/jenkins/pr-merge failed: https://jenkins.flatironinstitute.org/blue/organizations/jenkins/Stan%2FStan/detail/PR-3305/27/pipeline
Yes, because this PR introduces 4 new compiler warnings, which crosses the threshold of what we allow in CI.
thanks - fixed! (this time for sure!)
Name | Old Result | New Result | Ratio | Performance change( 1 - new / old ) |
---|---|---|---|---|
arma/arma.stan | 0.33 | 0.32 | 1.01 | 0.85% faster |
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 0.96 | -3.66% slower |
gp_regr/gen_gp_data.stan | 0.02 | 0.02 | 1.03 | 3.14% faster |
gp_regr/gp_regr.stan | 0.1 | 0.1 | 0.99 | -1.2% slower |
sir/sir.stan | 68.95 | 68.27 | 1.01 | 0.99% faster |
irt_2pl/irt_2pl.stan | 4.11 | 4.17 | 0.99 | -1.38% slower |
eight_schools/eight_schools.stan | 0.06 | 0.06 | 1.04 | 3.54% faster |
pkpd/sim_one_comp_mm_elim_abs.stan | 0.24 | 0.25 | 0.96 | -3.94% slower |
pkpd/one_comp_mm_elim_abs.stan | 19.63 | 18.85 | 1.04 | 3.95% faster |
garch/garch.stan | 0.44 | 0.41 | 1.09 | 8.32% faster |
low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.74 | 2.61 | 1.05 | 4.71% faster |
arK/arK.stan | 1.78 | 1.75 | 1.02 | 1.61% faster |
gp_pois_regr/gp_pois_regr.stan | 2.84 | 2.81 | 1.01 | 1.27% faster |
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.89 | 8.43 | 1.05 | 5.16% faster |
performance.compilation | 180.44 | 180.36 | 1.0 | 0.04% faster |
Mean result: 1.0169453997994367
Jenkins Console Log Blue Ocean Commit hash: 2cd95be39b0482d0fedb7fe9a1e8c77b7084592a
Name | Old Result | New Result | Ratio | Performance change( 1 - new / old ) |
---|---|---|---|---|
arma/arma.stan | 0.34 | 0.32 | 1.07 | 6.31% faster |
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 0.91 | -10.28% slower |
gp_regr/gen_gp_data.stan | 0.02 | 0.02 | 1.03 | 3.15% faster |
gp_regr/gp_regr.stan | 0.09 | 0.1 | 0.99 | -1.12% slower |
sir/sir.stan | 70.04 | 68.31 | 1.03 | 2.47% faster |
irt_2pl/irt_2pl.stan | 4.2 | 4.31 | 0.97 | -2.7% slower |
eight_schools/eight_schools.stan | 0.06 | 0.06 | 1.03 | 2.94% faster |
pkpd/sim_one_comp_mm_elim_abs.stan | 0.25 | 0.25 | 0.98 | -2.06% slower |
pkpd/one_comp_mm_elim_abs.stan | 19.51 | 18.79 | 1.04 | 3.71% faster |
garch/garch.stan | 0.45 | 0.41 | 1.09 | 8.47% faster |
low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.72 | 2.6 | 1.05 | 4.4% faster |
arK/arK.stan | 1.82 | 1.72 | 1.06 | 5.58% faster |
gp_pois_regr/gp_pois_regr.stan | 2.87 | 2.72 | 1.06 | 5.25% faster |
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.78 | 8.44 | 1.04 | 3.88% faster |
performance.compilation | 181.33 | 181.18 | 1.0 | 0.08% faster |
Mean result: 1.0224876057703807
Jenkins Console Log Blue Ocean Commit hash: 2cd95be39b0482d0fedb7fe9a1e8c77b7084592a
Name | Old Result | New Result | Ratio | Performance change( 1 - new / old ) |
---|---|---|---|---|
arma/arma.stan | 0.36 | 0.32 | 1.14 | 12.57% faster |
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.01 | 0.01 | 0.99 | -0.72% slower |
gp_regr/gen_gp_data.stan | 0.03 | 0.02 | 1.2 | 16.83% faster |
gp_regr/gp_regr.stan | 0.1 | 0.09 | 1.04 | 3.42% faster |
sir/sir.stan | 69.38 | 70.3 | 0.99 | -1.31% slower |
irt_2pl/irt_2pl.stan | 4.25 | 3.96 | 1.07 | 6.73% faster |
eight_schools/eight_schools.stan | 0.06 | 0.05 | 1.07 | 6.22% faster |
pkpd/sim_one_comp_mm_elim_abs.stan | 0.25 | 0.25 | 0.98 | -2.37% slower |
pkpd/one_comp_mm_elim_abs.stan | 19.72 | 18.95 | 1.04 | 3.92% faster |
garch/garch.stan | 0.44 | 0.41 | 1.06 | 6.07% faster |
low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.71 | 2.6 | 1.04 | 3.77% faster |
arK/arK.stan | 1.76 | 1.72 | 1.02 | 2.31% faster |
gp_pois_regr/gp_pois_regr.stan | 2.84 | 2.8 | 1.01 | 1.22% faster |
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.83 | 8.4 | 1.05 | 4.89% faster |
performance.compilation | 180.95 | 180.82 | 1.0 | 0.07% faster |
Mean result: 1.0473007260170133
Jenkins Console Log Blue Ocean Commit hash: 2cd95be39b0482d0fedb7fe9a1e8c77b7084592a
closing this PR - see comment https://github.com/stan-dev/stan/pull/3310#issuecomment-2389091218
Submission Checklist
./runTests.py src/test/unit
make cpplint
Summary
Add split-rank-folded ESS.
Intended Effect
Expose new split-Rhat and split-ESS for CmdStan
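For readers unfamiliar with the term, the following is a minimal sketch of the basic split R-hat computation: each chain is split in half, then between-half and within-half variances are compared. It is not the code in this PR, it omits the rank normalization and folding steps, and all names in it are hypothetical. It assumes equal-length chains with at least 4 draws each.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Chain = std::vector<double>;

inline double mean(const Chain& x) {
  double sum = 0.0;
  for (double v : x) sum += v;
  return sum / static_cast<double>(x.size());
}

// Unbiased sample variance (divides by n - 1).
inline double sample_variance(const Chain& x) {
  const double m = mean(x);
  double ss = 0.0;
  for (double v : x) ss += (v - m) * (v - m);
  return ss / static_cast<double>(x.size() - 1);
}

// Split every chain into a first and second half; an odd-length chain
// loses its middle draw.
inline std::vector<Chain> split_chains(const std::vector<Chain>& chains) {
  std::vector<Chain> halves;
  for (const Chain& c : chains) {
    const std::size_t half = c.size() / 2;
    halves.emplace_back(c.begin(), c.begin() + half);
    halves.emplace_back(c.end() - half, c.end());
  }
  return halves;
}

inline double split_rhat(const std::vector<Chain>& chains) {
  assert(!chains.empty() && chains.front().size() >= 4);
  const std::vector<Chain> halves = split_chains(chains);
  const double n = static_cast<double>(halves.front().size());
  Chain means, vars;
  for (const Chain& h : halves) {
    means.push_back(mean(h));
    vars.push_back(sample_variance(h));
  }
  const double w = mean(vars);                     // within-chain variance W
  const double b_over_n = sample_variance(means);  // between-chain variance B / n
  const double var_plus = (n - 1.0) / n * w + b_over_n;
  return std::sqrt(var_plus / w);
}
```

Splitting makes a within-chain trend visible: a chain that drifts from one region to another yields two halves with different means, which inflates the between-half term and pushes the statistic above 1.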
How to Verify
unit tests
Side Effects
N/A
Documentation
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: