joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License
347 stars 76 forks source link

log(z) is incorrect when some initial samples have log(L)=-inf #412

Closed segasai closed 1 year ago

segasai commented 1 year ago

While thinking of refactoring the initialize_live_points here https://github.com/joshspeagle/dynesty/blob/86ec8fdc68892a58c0cbbfab9bbdfff5e464853f/py/dynesty/dynamicsampler.py#L333

I noticed that the function behaves incorrectly when none of the initial samples are finite. Because then we just try 100 times till some of the samples are, but that is essentially reducing the volume we are dealing with from unity. But that's not taken into account anywhere.

Here's a demonstration where a truncated in the wings gaussian which is supposed to integrate to logz=0 gives a logz=6.2+/.2 This was tuned to make sure none of the initial samples have finite logl

import numpy as np
import dynesty

nlive = 100
ndim = 2
volfrac = 0.001
r0 = 5

size = r0 / volfrac**(1. / ndim)
assert (size > r0)

def loglike_inf(x):
    r2 = np.sum(x**2)
    r = np.sqrt(r2)

    if r > r0:
        ret = -2000 - 0.001 * r2  # that works
        ret = -np.inf
    else:
        ret = -0.5 * r2
    lnorm = -ndim / 2. * np.log(2 * np.pi) + ndim * np.log(2 * size)
    # first factor is gaussian norm, second is from the prior to integrate to 1
    return ret + lnorm

def prior_transform(x):
    return (2 * x - 1) * size

def testx():
    rstate = np.random.default_rng(1)
    sampler = dynesty.NestedSampler(loglike_inf,
                                    prior_transform,
                                    ndim,
                                    nlive=nlive,
                                    rstate=rstate)
    sampler.run_nested(print_progress=True)
    res = sampler.results
    print(res.logz[-1], res.logzerr[-1])

if __name__ == '__main__':
    testx()

If I change the logz=-inf to something very negative but finite it immediately works again giving logz=-0.19

Somewhat to my surprise when I change the volfrac to 0.1 we still get a biased result of logz=1.6+/.0.1 That suggests me there is another problem. I think that must be related to the fact that all the points are assigned the same value of -1e300.

I don't have an immediate resolution to this problem, but I think this is a major issue.