Closed mitzimorris closed 2 weeks ago
@avehtari @jgabry do we also want to implement the MCSE described in section 4.4 ? has this already been done in R? which package and where?
also pings to @aleksgorica and @SteveBronder.
plugging this in to CmdStan's stansummary is trivial, cf: https://github.com/stan-dev/cmdstan/compare/develop...feature/1263-new-rhat-summary
@avehtari @jgabry do we also want to implement the MCSE described in section 4.4 ? has this already been done in R? which package and where?
Yeah bulk and tail ESS are included in the standard diagnostics computed by the posterior package:
https://github.com/stan-dev/posterior/blob/master/R/convergence.R
@avehtari @jgabry do we also want to implement the MCSE described in section 4.4 ? has this already been done in R? which package and where?
Yeah bulk and tail ESS are included in the standard diagnostics computed by the posterior package:
And the implementation for the MCSE for quantiles (Section 4.4 in the paper) is in .mcse_quantile
function
https://github.com/stan-dev/posterior/blob/18c915e540e7578bb69830f3d544e1cfcae26b72/R/convergence.R#L418
is current ESS computation using autocorrelation? comment from @syclik, added 12 years ago remains: https://github.com/stan-dev/stan/blob/cb7fe1f6c4fdabb03d56b299c1424ef3f847a26b/src/stan/mcmc/chains.hpp#L561
the answer is yes it is. comment should be removed.
I don't know about the C++ version, but the ESS calculations in the posterior package that @avehtari and I linked to definitely do use it.
On Tue, Jul 16, 2024 at 12:51 PM Mitzi Morris @.***> wrote:
is current ESS computation using autocorrelation? comment from @syclik https://github.com/syclik, added 12 years ago remains: https://github.com/stan-dev/stan/blob/cb7fe1f6c4fdabb03d56b299c1424ef3f847a26b/src/stan/mcmc/chains.hpp#L561
— Reply to this email directly, view it on GitHub https://github.com/stan-dev/stan/issues/3299#issuecomment-2231605311, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3PQQ2C2JZOQ6OKZDTQTV3ZMVTTNAVCNFSM6AAAAABKZTNB7CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZRGYYDKMZRGE . You are receiving this because you were mentioned.Message ID: @.***>
I agree that that FIXME comment should be removed
since we don't know how much of the API in chains.hpp
is being used elsewhere, I think that we should re-implement it as a new class chainset.hpp
, which takes as its underlying data structure a std::vector\<Eigen MatrixXd> to hold the chains, where all chains are the same size and shape, with matching column names, and do not contain any warmup draws. enforcing these constraints at object instantiation will allow us to get rid of a lot of repeated code across the various functions which do these checks. at some future point, we could deprecate chains.hpp
.
further refactoring of the code in stan/analyze/mcmc: the rank-normalized split rhat added to compute_potential_scale_reduction.hpp
should be split out into a separate files for rank-normalization, bulk and tail rhat, and bulk and tail ess.
@WardBrian, @SteveBronder thoughts?
I'm still not sure we need something like chains.hpp
at all. If we had analysis functions that took as an argument std::vector<Eigen::MatrixXd> chains
, I suspect that is all we really need. Adding this object whose only job is to call the other functions seems like more code for little benefit, unless we use the object abstraction in some other way as well that I've missed
I'm still not sure we need something like
chains.hpp
at all. If we had analysis functions that took as an argumentstd::vector<Eigen::MatrixXd> chains
, I suspect that is all we really need.
generally agree. the added functionality needed is consistency checking across chains, indexing into this object by column name, and ensuring that variance calculations are done on columns of finite, non-identical values. but yes, the analysis functions should take as arguments std::vector<Eigen::MatrixXd> chains
plus column index/name and return the computed statistic/diagnostic for that column across all chains.
Summary:
PR https://github.com/stan-dev/stan/pull/3266 implemented the rank-normalization and folding needed for Rhat-bulk and Rhat-tail. We should also add logic for ESS-bulk and ESS-tail, per Vehtari et al https://arxiv.org/abs/1903.08008.
Description:
Bulk ESS and tail ESS are available in R, but not in CmdStan or CmdStanPy. We need to add this to Stan so that all interfaces are providing the same estimates given a sample.
Bulk ESS can be computed from the rank normalized draws, per section 4.1
Tail ESS is defined in section 4.3
Current Version:
v2.35.0