joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License

NaNs and Infs in the status #111

Closed JohannesBuchner closed 5 years ago

JohannesBuchner commented 5 years ago

I am having some issues with the following toy problem, which is basically a Bayesian Fourier analysis.

Firstly, I get NaNs and Infs in the status, which I am sure are not coming from the likelihood:

iter: 22178 | bound: 507 | nc: 545 | ncall: 938419 | eff(%): 2.363 | loglstar: -inf < -55.450 < inf | logz: -76.877 +/- nan | dlogz: 5.547 > 1.009

Secondly, PyMultinest finds two peaks, as expected, while dynesty only finds one. I suspect this is because MultiNest implements multimodal nested sampling, i.e., it effectively doubles the number of live points when a mode splits. So I increased the number of live points to 1000, but then the run didn't terminate. This makes me wonder if it would make sense to add a dynamic nested sampling policy that adds points near a mode that is about to be lost (see the sketch after the script below).

import numpy
from numpy import sin, exp, log, pi
import matplotlib.pyplot as plt

prefix = 'simple_problem'
numpy.random.seed(1)
x = numpy.random.uniform(0, 2 * pi, size=100)
x.sort()
#x = numpy.linspace(0, 2 * pi, 100)
# true parameters: two sinusoidal components (amplitude, frequency, phase) plus noise
fa = 4.2 / 4
fb = 42 / 10
na = 0.8
nb = 0.1
pa = 0.42
pb = 2.4
sigma = 0.2
ypred = exp(na * sin(x * fa + pa) + nb * sin(x * fb + pb))
y = numpy.random.normal(ypred, sigma)
plt.plot(x, y, 'x')
plt.plot(x, ypred)
plt.savefig(prefix + "data.pdf", bbox_inches='tight')
plt.close()

ndim = 3 * 2 + 1  # (amplitude, frequency, phase) x 2 components + sigma

def prior_transform(cube):
    # copy so we don't modify dynesty's unit-cube sample in place
    params = cube.copy()
    params[0] = cube[0]*4-1
    params[1] = cube[1]*4
    params[2] = cube[2]*2*pi
    params[3] = cube[3]*4-1
    params[4] = cube[4]*4
    params[5] = cube[5]*2*pi
    params[6] = cube[6]*2-2
    return params

def loglike(params):
    logna, logfa, pa, lognb, logfb, pb, logsigma = params
    na, fa, nb, fb, sigma = 10**logna, 10**logfa, 10**lognb, 10**logfb, 10**logsigma
    ypred = exp(na * sin(x * fa + pa) + nb * sin(x * fb + pb))
    # Gaussian log-likelihood (note the normalisation term uses sigma**2)
    logl = -0.5 * (((ypred - y) / sigma)**2 + log(2 * pi * sigma**2)).sum()
    if not numpy.isfinite(logl):
        logl = -1e300
    return logl

from pymultinest.solve import solve
results = solve(LogLikelihood=loglike, Prior=prior_transform, n_dims=ndim, 
    n_live_points=400,
    outputfiles_basename=prefix + '_out_', verbose=True, resume=True)

import dynesty
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim, bound='multi', nlive=1000)
sampler.run_nested()
results = sampler.results

from dynesty import plotting as dyplot
dyplot.runplot(results)
plt.savefig(prefix + "run.pdf", bbox_inches='tight')
plt.close()

dyplot.traceplot(results)
plt.savefig(prefix + "trace.pdf", bbox_inches='tight')
plt.close()

dyplot.cornerplot(results)
plt.savefig(prefix + "corner.pdf", bbox_inches='tight')
plt.close()
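
For the dynamic policy I have in mind, a minimal sketch with dynesty's DynamicNestedSampler would look like this. It reuses loglike, prior_transform, and ndim from the script above; note the built-in weight function only trades off posterior versus evidence accuracy via pfrac, it is not a mode-tracking policy:

import dynesty

# Sketch only: batches of nlive_batch live points are added where the
# built-in weight function puts mass; pfrac=1.0 targets the posterior,
# so extra points land where any surviving modes live.
dsampler = dynesty.DynamicNestedSampler(loglike, prior_transform, ndim,
                                        bound='multi')
dsampler.run_nested(nlive_init=500, nlive_batch=100,
                    wt_kwargs={'pfrac': 1.0})
dresults = dsampler.results
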
joshspeagle commented 5 years ago

get NaNs and Infs in the status

This isn't a bug, but a result of (1) the approximation I'm using for the nested sampling error estimate breaking down and (2) the +/-inf log-likelihood bounds in the status just being how the static sampler is set up, so that it stays consistent with the DynamicNestedSampler interface. I haven't managed to add this to the documentation yet. I've been thinking about implementing a slightly more robust algorithm for the evidence error to prevent this, but that's still TBD.
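
Roughly what happens (an illustration, not dynesty's actual code): the reported error is the square root of an approximate running variance for ln(Z), and that approximation can dip below zero numerically, at which point the square root comes back as NaN:

import numpy as np

# Illustration only. If the running variance approximation for ln(Z)
# goes slightly negative, sqrt() returns nan -- the "+/- nan" in the status.
logzvar = -1e-6  # hypothetical near-zero, slightly-negative approximation
with np.errstate(invalid='ignore'):
    logzerr = np.sqrt(logzvar)
print(logzerr)   # nan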

but that didn't terminate

I just want to check if you're using the pip install version or the version on GitHub. It sounds like you might be encountering some problems with the ellipsoid decomposition and/or bootstrapping.
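
(A quick way to confirm which build is installed, since dynesty exposes a version string:)

import dynesty
print(dynesty.__version__)  # compare against the latest pip release vs. the GitHub source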

JohannesBuchner commented 5 years ago

I use the version from github.

joshspeagle commented 5 years ago

Okay, so I've looked into this and can replicate the multimodal solution without a problem. Part of the reason for the convergence issues is that several of the solutions sit right at the edge of the prior, which really messes with the bounding ellipsoids since all the live points pile up next to the boundary. That drastically reduces the sampling efficiency, since by sheer chance a large fraction of proposed points fall outside the unit cube and have to be rejected. This is compounded by the fact that the likelihood in several other parameters is extremely broad (spanning a good chunk of the unit cube), which also gives ellipsoids that extend past the unit-cube bounds. So the ellipsoids just don't end up being very good.

Another issue is that during intermediate stages of sampling the distribution takes a while to split into the final modes. This causes some trouble for my default choice of ellipsoid decompositions (which are much more conservative than MultiNest's), leading to low efficiencies.
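
If anyone wants to experiment, the bound behaviour can be tuned at construction time. A sketch using dynesty's bound-related keywords, with values that are just starting points rather than recommendations:

import dynesty

# Delay the first bound update and use a fixed enlargement factor instead
# of bootstrapped expansions, so early, badly-split ellipsoids are avoided.
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim,
                                bound='multi', nlive=1000,
                                enlarge=1.25, bootstrap=0,
                                first_update={'min_ncall': 5000,
                                              'min_eff': 10.0})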

I fixed both issues by (1) shifting the prior and (2) using slice sampling, which appears to give the correct results; a sketch follows.
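
A minimal sketch of that combination. The shifted ranges below are my guess at what "shifting the prior" means here (widening the ranges whose true values sat on the boundary, namely log10(fa) and log10(nb)); sample='slice' selects dynesty's slice sampling, and loglike and ndim are reused from the original script:

import dynesty
from numpy import pi

def prior_transform_shifted(cube):
    # Same structure as before, but the ranges for log10(fa) and log10(nb)
    # are widened/shifted (hypothetical choices) so the true solutions no
    # longer sit on the prior boundary.
    params = cube.copy()
    params[0] = cube[0] * 4 - 1    # log10(na) in [-1, 3]
    params[1] = cube[1] * 5 - 1    # log10(fa) in [-1, 4], shifted off the edge
    params[2] = cube[2] * 2 * pi   # pa in [0, 2*pi]
    params[3] = cube[3] * 4 - 2    # log10(nb) in [-2, 2], shifted off the edge
    params[4] = cube[4] * 4        # log10(fb) in [0, 4]
    params[5] = cube[5] * 2 * pi   # pb in [0, 2*pi]
    params[6] = cube[6] * 2 - 2    # log10(sigma) in [-2, 0]
    return params

# Slice sampling sidesteps the accept/reject step, so imperfect ellipsoids
# only affect efficiency indirectly.
sampler = dynesty.NestedSampler(loglike, prior_transform_shifted, ndim,
                                bound='multi', sample='slice', nlive=1000)
sampler.run_nested()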

joshspeagle commented 5 years ago

This was an interesting toy problem, so I'll plan to add a modified version of this into the set of examples. Thanks!