Improved documentation for weights used for calculation of posterior samples.

joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences

MIT License

347 stars 76 forks source link

I use the following code for calculating posterior samples with dynesty:

# sampler is a DynamicNestedSampler
sampler.run_nested()
res = sampler.results
samples, weights = res.samples, np.exp(res.logwt - res.logz[-1])
new_samples = dyfunc.resample_equal(samples, weights)

This was based on code snippets from the dynesty documentation. However, it's not clear to me how the weights calculation np.exp(res.logwt - res.logz[-1]) comes about, and I'm concerned that I'm applying this blindly.

is it always the same calculation to calculate these weights?
would it be possible to explain the weights calculation in an accessible manner? The calculation is effectively dividing the weights by the evidence, but why the last entry in logz?
if it's always the same calculation why isn't there a res.weights attribute that can be used, rather than having to use the same calculation over and over?

I'm a newbie to nested sampling, so these questions may be super simple, but I think that the documentation could be improved in this regard.

is it always the same calculation to calculate these weights?

Yes, although note that you don't technically need to compute the weights directly -- a lot of the dynesty internals purposely use the natural log of the weights for numerical stability, only converting to the weights when necessary at the final stages of a calculation.

would it be possible to explain the weights calculation in an accessible manner? The calculation is effectively dividing the weights by the evidence, but why the last entry in logz?

res.logwt is just the natural log of the weights. So np.exp just converts from logarithmic to linear units. The weights are not normalized to sum to 1, however, but rather the evidence. As you mention, this just means we have to divide by the evidence. We could do this after exponentiation, but for numerical reasons it is more stable to do this inside the exponential in log space. Since logz is an array of the natural log of the evidence estimated at each iteration, we want to use the final value.

if it's always the same calculation why isn't there a res.weights attribute that can be used, rather than having to use the same calculation over and over?

I could have included something like this but just felt like it was just redundant when logwt was already included (and generally preferred for numerical stability). It'd be pretty easy to add in though.

joshspeagle / dynesty

Improved documentation for weights used for calculation of posterior samples. #199