Highest Posterior Density

ericmjl / bayesian-stats-modelling-tutorial

How to do Bayesian statistical modelling using numpy and PyMC3

MIT License

Highest Posterior Density #87

Open hugobowne opened 4 years ago

hugobowne commented 4 years ago

how do you like to describe HPD @ericmjl @justinbois @betanalpha ?

justinbois commented 4 years ago

http://bebi103.caltech.edu.s3-website-us-east-1.amazonaws.com/2020b/content/lecture_notes/lecture_04/posterior_summaries.html

From that, "If we’re considering a 95% credible interval, the HPD interval is the shortest interval that contains 95% of the probability mass of the posterior."
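
A minimal sketch of that definition, assuming the posterior is available as a 1-D array of draws. The function name `hpd_interval` and the gamma-distributed stand-in "posterior" are illustrative assumptions, not taken from the linked notes. It sorts the draws and slides a window holding 95% of them, keeping the narrowest one:

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples (hypothetical helper)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Number of sorted samples each candidate interval must span.
    k = int(np.ceil(mass * n))
    # Width of every window of k consecutive sorted samples.
    widths = x[k - 1:] - x[: n - k + 1]
    start = int(np.argmin(widths))
    return x[start], x[start + k - 1]

rng = np.random.default_rng(0)
# A right-skewed stand-in for posterior draws (an assumption for illustration).
draws = rng.gamma(shape=2.0, scale=1.0, size=20_000)

hpd_lo, hpd_hi = hpd_interval(draws)
eq_lo, eq_hi = np.percentile(draws, [2.5, 97.5])  # equal-tailed interval for comparison

print(f"HPD:          [{hpd_lo:.3f}, {hpd_hi:.3f}]")
print(f"equal-tailed: [{eq_lo:.3f}, {eq_hi:.3f}]")
```

For a right-skewed sample like this one, the shortest interval sits closer to zero than the equal-tailed interval and is no wider than it.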

ericmjl commented 4 years ago

I was struggling for the right words, thanks @justinbois!

betanalpha commented 4 years ago

I don't describe the HPD at all because it's not a well-defined probabilistic object, i.e. it can't be defined as expectation values (or differences of expectation values in the case of quantiles) and hence can't be put into a decision theoretic framework. The subtlety in @justinbois's description is that "shortest" is not well-defined and will in general change when you reparameterize the ambient space. For me the additional computational problem with HPD estimators in the case of nontrivial target distributions makes the concept too confusing to be worth considering at all.
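
The reparameterization point can be illustrated numerically. The sketch below is an assumption-laden toy, not anything from the thread: it uses a hypothetical helper `shortest_interval` and lognormal draws, computes the shortest 95% interval once on `theta` and once on `log(theta)`, and maps the latter back with `exp`. If "shortest" were parameterization-independent, the two would agree.

```python
import numpy as np

def shortest_interval(samples, mass=0.95):
    """Narrowest window of sorted samples holding at least `mass` of them (hypothetical helper)."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = int(np.ceil(mass * x.size))
    widths = x[k - 1:] - x[: x.size - k + 1]
    i = int(np.argmin(widths))
    return np.array([x[i], x[i + k - 1]])

rng = np.random.default_rng(1)
log_theta = rng.normal(loc=0.0, scale=1.0, size=50_000)  # toy draws on the log scale
theta = np.exp(log_theta)                                # same draws on the original scale

hpd_theta = shortest_interval(theta)                # shortest interval computed on theta
hpd_via_log = np.exp(shortest_interval(log_theta))  # shortest on log(theta), mapped back

print("shortest on theta:          ", hpd_theta)
print("shortest on log(theta), exp:", hpd_via_log)
```

The two intervals cover the same posterior mass but have visibly different endpoints, because length is measured differently on each scale.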

ericmjl commented 4 years ago

What if we reported just the interval from the 2.5th to the 97.5th percentile instead?

justinbois commented 4 years ago

Yes, @ericmjl, that's what @betanalpha advocates, and I usually do, too. The percentile interval is unaffected by any monotonic change of variables: its endpoints simply map through the transformation.
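
A quick numerical check of that property, assuming toy lognormal draws and `log` as the monotonic change of variables (both chosen only for illustration). Taking percentiles of the transformed draws should match transforming the percentiles of the original draws, up to the small interpolation `np.percentile` does between neighbouring draws:

```python
import numpy as np

rng = np.random.default_rng(2)
theta = rng.lognormal(mean=0.0, sigma=1.0, size=50_000)  # toy "posterior" draws

q = [2.5, 97.5]
# Percentiles computed after the change of variables...
direct = np.percentile(np.log(theta), q)
# ...versus percentiles computed first and then pushed through the same function.
mapped = np.log(np.percentile(theta, q))

print("percentiles of log(theta):", direct)
print("log of percentiles(theta):", mapped)
```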

betanalpha commented 4 years ago

To be honest, 2.5% and 97.5% require way too many effective samples to be reasonable defaults. If you need 100 effective samples below the 2.5% quantile to estimate it with any decent precision, then you need about 4000 effective samples total, which requires far more work than needed for most other estimation goals. When using intervals I just default to 10%-90%, and then, if showing nested intervals, continue on to 20%-80%, 30%-70%, 40%-60%, and then the median. If I just need to summarize a one-dimensional marginal then I'll usually use a histogram, looking at the ECDF if there's any suspicion of odd effects like repeated values.
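
The arithmetic and the nested-interval summary above can be sketched as follows. The helper `total_ess_needed` is a hypothetical name, the target of 100 tail samples is the rule of thumb from the comment, and the draws are independent normals, so the effective sample size here simply equals the number of draws (which real MCMC output would not guarantee):

```python
import numpy as np

def total_ess_needed(tail_prob, tail_ess=100):
    """Total effective samples so that about `tail_ess` land beyond the tail quantile."""
    return tail_ess / tail_prob

ess_for_2p5 = total_ess_needed(0.025)  # about 4000, matching the figure in the comment
ess_for_10 = total_ess_needed(0.10)    # about 1000 under the same rule of thumb

rng = np.random.default_rng(3)
draws = rng.normal(size=1_000)  # independent toy draws (an assumption for illustration)

# Nested central intervals, from widest to narrowest, then the median.
levels = [(10, 90), (20, 80), (30, 70), (40, 60)]
nested = [np.percentile(draws, [lo, hi]) for lo, hi in levels]
median = np.percentile(draws, 50)

for (lo, hi), (a, b) in zip(levels, nested):
    print(f"{lo}%-{hi}%: [{a:.2f}, {b:.2f}]")
print(f"median: {median:.2f}")
```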