arviz-devs / arviz

Exploratory analysis of Bayesian models with Python
https://python.arviz.org
Apache License 2.0

Starting a nested sampling interface #1741

Open grburgess opened 3 years ago

grburgess commented 3 years ago

Tell us about it

In order to switch my software over to storing its output uniformly in InferenceData format, I need to implement a nested sampling converter. So I am looking to begin implementing this, but it would be good to build some consensus on what we would like to extract from the output of the various nested sampling codes.

1) They all have a single chain, so this is a simple choice.
2) We likely only want to store the weighted samples, since there is no slot for weights in the inference object.
3) Dimensionality of parameters? I never use nested sampling for anything but very simple models, so if anyone has experience storing something more complex, it would be nice to hear from them.

Thoughts on implementation

It would be nice to do this in a class-hierarchical fashion so that the main interface is something like:


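class NestedSamplingConverter:
    # One entry point per nested sampling code (bodies still to be written).

    def from_multinest(self): ...
    def from_dynesty(self): ...
    # ... plus from_<sampler>() for further codes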

and this class does the conversion to InferenceData in some common way.
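To make the "common way" a bit more concrete, here is a minimal sketch assuming each sampler-specific constructor only has to deliver samples, log-likelihoods, and log-weights, and a shared method arranges them with a leading chain dimension of length 1. The names `NestedSamplingResult`, `to_dicts`, and `log_weight`, as well as the placement of the extra fields under a `sample_stats`-like dict, are hypothetical and not an existing ArviZ API. The returned plain dicts are the kind of thing that could then be handed to `arviz.from_dict`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NestedSamplingResult:
    """Hypothetical sampler-agnostic container (not part of ArviZ)."""

    samples: Dict[str, List[float]]  # parameter name -> one value per draw
    log_likelihood: List[float]      # one value per draw
    log_weight: List[float]          # one value per draw

    def to_dicts(self):
        """Arrange all fields as (chain=1, draw) nested lists."""
        n_draws = len(self.log_likelihood)
        if len(self.log_weight) != n_draws:
            raise ValueError("log_weight length does not match the number of draws")
        for name, values in self.samples.items():
            if len(values) != n_draws:
                raise ValueError(f"{name!r} has {len(values)} draws, expected {n_draws}")

        # Wrapping each list in another list adds the single-chain dimension.
        posterior = {name: [list(values)] for name, values in self.samples.items()}
        sample_stats = {
            "log_likelihood": [list(self.log_likelihood)],
            "log_weight": [list(self.log_weight)],  # assumed name, not in the schema
        }
        return posterior, sample_stats
```

Each `from_multinest()`, `from_dynesty()`, etc. would then only need to fill such a container from its own output format.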

I believe we could build off the emcee interface. We could add some code to resample the weighted samples into equally weighted ones, as is done in some codes. Otherwise, the interface would just need to be tuned to grab samples and log likelihoods from the various codes.
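For the resampling step, a minimal sketch using systematic resampling on the log-weights could look like the following. The function name `resample_equal` and its signature are made up for illustration; this is one common scheme, not necessarily what any particular nested sampling code does internally.

```python
import math
import random


def resample_equal(samples, log_weights, rng=None):
    """Systematic resampling of weighted draws into equally weighted ones."""
    if len(samples) != len(log_weights):
        raise ValueError("samples and log_weights must have the same length")
    if not samples:
        raise ValueError("need at least one sample")
    max_lw = max(log_weights)
    if max_lw == float("-inf"):
        raise ValueError("all weights are zero")
    rng = rng or random.Random()

    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(lw - max_lw) for lw in log_weights]
    total = sum(weights)

    n = len(samples)
    offset = rng.random()  # a single uniform draw shared by all positions
    positions = [(offset + i) / n for i in range(n)]

    resampled = []
    j = 0
    cumulative = weights[0] / total
    for pos in positions:
        # Advance to the first sample whose cumulative weight reaches pos.
        while pos > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j] / total
        resampled.append(samples[j])
    return resampled
```

The output has the same length as the input, with high-weight draws repeated and low-weight draws dropped, so it could be stored like any other single-chain posterior.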

joergfunger commented 2 years ago

Is there an update on this? We are also interested in visualizing nested sampling results from dynesty with arviz.

grburgess commented 2 years ago

Not from my side, but I am still interested in developing it.

ahartikainen commented 2 years ago

I think we currently have an interface for emcee, which uses a multichain structure even when there is no real multichain data. This is not optimal.

What kind of structure do the nested sampling algos output?

We could always extend our specification for weighted samples? What should be the location for the weights? Inside the posterior (etc.) group? And if so, should we use some specific name for the weights?

TTitscher commented 2 years ago

I am also interested in plotting nested sampling results, thank you for your response!

What kind of structure do the nested sampling algos output?

So far, I have looked into Nestle and Dynesty, and the important additional output is a (log-)weight for each posterior sample. Thus, from what I understand of the InferenceData structure, there could be an additional weight vector in sample_stats.
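As a rough illustration of what such a weight vector would contain, the sketch below turns per-sample log-weights into normalized weights and reports the Kish effective sample size. Both function names are hypothetical helpers, not part of ArviZ, Nestle, or Dynesty.

```python
import math


def normalize_log_weights(log_weights):
    """Convert log-weights to weights that sum to 1 (log-sum-exp trick)."""
    max_lw = max(log_weights)
    log_norm = max_lw + math.log(sum(math.exp(lw - max_lw) for lw in log_weights))
    return [math.exp(lw - log_norm) for lw in log_weights]


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)
```

A low effective sample size relative to the number of stored draws is a hint that plots ignoring the weights would be misleading.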

As for working with weighted samples, the KDE algorithms in particular would need adjustments, but there are open-source implementations (e.g. lightkde) that could be used if weights are available.
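To show the kind of adjustment meant here, a hand-written weighted Gaussian KDE in one dimension might look like this. It assumes a fixed, user-supplied bandwidth and does not use lightkde or the ArviZ KDE code; the function name is invented for the example.

```python
import math


def weighted_gaussian_kde(samples, weights, bandwidth):
    """Return a function that evaluates a weighted Gaussian KDE at a point x."""
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    total = sum(weights)
    if total <= 0 or bandwidth <= 0:
        raise ValueError("need positive total weight and positive bandwidth")
    norm = 1.0 / (total * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each sample contributes a Gaussian bump scaled by its weight.
        return norm * sum(
            w * math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for s, w in zip(samples, weights)
        )

    return density
```

With all weights equal this reduces to the usual unweighted estimate, so the same code path could serve both cases.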

juehang commented 2 months ago

Have there been any updates on this? In addition to the weight and the samples, nested samplers also return an estimate of the evidence, which one would want to include.