Open JorisDeRidder opened 1 year ago
That's a very good idea, and it should be possible. Maybe you could try implementing it for one parameter based on the posterior samples?
General recipe:
This approach can in principle give you multimodal (disjoint) regions, which may or may not be what you want. If you want a single contiguous interval instead, you would also need to add the bins in between in step (3).
Step 1 may also not be trivial.
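The individual steps of the recipe are not reproduced here, so the following is only one plausible histogram-based reading of it, not necessarily what was meant: bin the samples, rank the bins by count, and keep the most populated bins until the requested probability is covered. The function name `histogram_hdi` and its `bins` and `contiguous` parameters are made up for this illustration, and the samples are assumed to be equally weighted.

```python
import numpy as np


def histogram_hdi(samples, prob=0.95, bins=50, contiguous=False):
    """Return a list of (low, high) intervals covering at least `prob` of the samples.

    Hypothetical helper for illustration; assumes equally weighted 1D samples.
    """
    counts, edges = np.histogram(samples, bins=bins)
    # Rank bins from most to least populated and accumulate their mass.
    order = np.argsort(counts)[::-1]
    cumulative = np.cumsum(counts[order]) / counts.sum()
    # Keep the smallest set of top-ranked bins whose mass reaches `prob`.
    n_keep = np.searchsorted(cumulative, prob) + 1
    selected = np.zeros(len(counts), dtype=bool)
    selected[order[:n_keep]] = True
    if contiguous:
        # Also include every bin between the outermost selected bins.
        idx = np.flatnonzero(selected)
        selected[idx[0]:idx[-1] + 1] = True
    # Merge runs of adjacent selected bins into intervals.
    intervals = []
    start = None
    for i, keep in enumerate(selected):
        if keep and start is None:
            start = i
        if not keep and start is not None:
            intervals.append((edges[start], edges[i]))
            start = None
    if start is not None:
        intervals.append((edges[start], edges[-1]))
    return intervals


# Synthetic bimodal samples, standing in for one column of posterior samples.
rng = np.random.default_rng(1)
demo = np.concatenate([rng.normal(-5, 0.5, 5000), rng.normal(5, 0.5, 5000)])
print(histogram_hdi(demo, prob=0.9))                   # disjoint regions
print(histogram_hdi(demo, prob=0.9, contiguous=True))  # one interval
```

The result depends on the number of bins, which is presumably part of why choosing the binning is not trivial. With `contiguous=True` the interval covers more than the requested probability whenever bins had to be filled in.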
Another approach is to use a kernel-density estimation library (fastkde, getdist, etc.) to do the job. I am not sure which one is best suited; perhaps arviz also has an implementation.
I'd be interested to hear how you solve this.
Your pointer to arviz proved to be the quickest way to obtain an HDI. Here is an example for other users. I'm assuming the following imports:
import numpy as np
import pandas as pd
import arviz as az
import xarray as xr
import ultranest
and that you ran UltraNest:
sampler = ultranest.ReactiveNestedSampler(param_names, my_loglikelihood, my_prior_transform, derived_param_names)
results = sampler.run()
To convert the UltraNest output to an ArviZ InferenceData object, I used
results_df = pd.DataFrame(data=results['samples'], columns=results['paramnames'])
results_df["chain"] = 0
results_df["draw"] = np.arange(len(results_df), dtype=int)
results_df = results_df.set_index(["chain", "draw"])
xdata = xr.Dataset.from_dataframe(results_df)
trace = az.InferenceData(posterior=xdata)
which is quite likely not the shortest way. :-)
A 95% HDI can then be obtained using ArviZ's built-in hdi() function:
hdi = az.hdi(trace, hdi_prob=0.95)
for name in results['paramnames']:
    print(name, ": ", hdi[name].values)
leading in my example to
intercept : [0.77835208 1.14636967]
alpha : [1.07772564 1.18964846]
sigma : [0.27528108 0.40888707]
slope : [1.84139496 2.47128276]
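As a cross-check that does not need ArviZ, a single-interval HDI can also be estimated directly from the samples by sliding a window over the sorted values and keeping the narrowest one. This is a minimal sketch in that spirit, not a copy of what `az.hdi` does internally, so small differences in the last digits are to be expected. The function name `shortest_interval` is hypothetical, and the samples are assumed to be equally weighted (as in `results['samples']`) and the posterior unimodal.

```python
import numpy as np


def shortest_interval(samples, prob=0.95):
    """Shortest interval containing a fraction `prob` of equally weighted samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    # Number of consecutive sorted samples each candidate interval must hold.
    n_inside = int(np.ceil(prob * n))
    if n_inside < 2 or n_inside > n:
        raise ValueError("need 0 < prob <= 1 and enough samples")
    # Width of every window of n_inside consecutive sorted samples.
    widths = x[n_inside - 1:] - x[:n - n_inside + 1]
    i = np.argmin(widths)
    return x[i], x[i + n_inside - 1]


# Synthetic skewed samples, standing in for one column of results['samples'].
rng = np.random.default_rng(1)
demo = rng.exponential(1.0, 20000)
print("HDI:        ", shortest_interval(demo, 0.95))
print("equal-tail: ", tuple(np.quantile(demo, [0.025, 0.975])))
```

For a skewed distribution like this one, the HDI hugs the peak and is narrower than the equal-tailed quantile interval; for a symmetric posterior the two nearly coincide.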
@JorisDeRidder btw, how do you get the highest density point though?
You mean the MAP? Some software packages (e.g. PyMC, I believe) simply maximize the function logPosterior = logLikelihood + logPrior numerically. That works well if your posterior is unimodal. If that's not the case, you might want to consider constructing a KDE of your posterior and determining the MAP from that.
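A minimal sketch of the KDE idea for a single parameter, using only NumPy: build a Gaussian KDE of the samples on a grid and take the grid point with the highest density. Note the assumptions: the function name `kde_mode` is hypothetical, the bandwidth uses Silverman's rule of thumb, the samples are equally weighted, and the result is the mode of the 1D marginal, which in general is not the same as the corresponding coordinate of the joint MAP.

```python
import numpy as np


def kde_mode(samples, grid_size=512):
    """Mode of a 1D Gaussian KDE of equally weighted samples (illustrative helper)."""
    x = np.asarray(samples, dtype=float)
    n = len(x)
    # Silverman's rule-of-thumb bandwidth for a Gaussian kernel.
    q75, q25 = np.percentile(x, [75, 25])
    spread = min(x.std(ddof=1), (q75 - q25) / 1.349)
    bandwidth = 0.9 * spread * n ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("samples have zero spread")
    grid = np.linspace(x.min(), x.max(), grid_size)
    # Unnormalized density: sum of Gaussian kernels at each grid point.
    # This builds a (grid_size, n) array, so thin very large sample sets first.
    z = (grid[:, None] - x[None, :]) / bandwidth
    density = np.exp(-0.5 * z**2).sum(axis=1)
    return grid[np.argmax(density)]


# Synthetic skewed samples, standing in for one column of posterior samples.
rng = np.random.default_rng(1)
demo = rng.gamma(shape=3.0, scale=1.0, size=4000)
print("KDE mode:", kde_mode(demo), " mean:", demo.mean())
```

A plain Gaussian KDE like this one is biased near hard prior bounds, because it smooths density across the boundary, so for bounded parameters a KDE that knows about the bounds is the safer choice.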
Here is an approach based on getdist, which supports bounds: https://gist.github.com/JohannesBuchner/2027d0f313521387c2cded2424cdcfeb
Bayesian credible intervals can currently be derived using quantiles. Would it be possible to implement a highest density interval (HDI) as well?