joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License

return random point if no MCMC steps accepted #354

Closed. ColmTalbot closed this 2 years ago.

ColmTalbot commented 2 years ago

Hi @joshspeagle @segasai this PR fixes an issue in the rwalk proposal method.

Currently, if no steps are accepted, the original point is returned. However, this point will almost always be accepted, since there is only a 1 / nlive probability that it does not have a higher likelihood than the lowest-likelihood live point (by definition).

Having repeated nested samples in the chain can easily lead to subtle biases and overly constrained ellipses.

The proposed fix is to just return a random point from the prior. This will almost certainly not be accepted but communicates to the sampler that no valid point was found.
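For concreteness, a minimal sketch of the proposed behaviour (a hypothetical fragment; naccept, ndim, rstate, prior_transform, and loglikelihood stand for the quantities already available inside dynesty's rwalk routine):

    # Sketch only: if no MH steps were accepted, return a fresh uniform draw
    # from the unit cube instead of the untouched starting point.
    if naccept == 0:
        u = rstate.uniform(size=ndim)  # random point from the unit prior cube
        v = prior_transform(u)         # map to the physical parameters
        logl = loglikelihood(v)        # almost surely < loglstar, so the caller rejects it and retries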

(FWIW, we've been using a version of this in Bilby for a while.)

segasai commented 2 years ago

Hi @ColmTalbot ,

Thanks for the patch, but I am not sure that's statistically correct. MH random walk requires a very specific proposal distribution to keep the distribution invariant and preserve the detailed balance. The current walker certainly does that (at the cost of possibly repeated points). I am not convinced that your code does that.

Also I agree the behaviour of the rwalk sampler when none of the samples are accepted is potentially problematic, but I am not convinced that your proposed solution is a correct one.

segasai commented 2 years ago

The reason why I'm concerned about this code is that the previous implementation (in 1.1) was incorrect in exactly this way. It showed up in cases where your posterior has a very sharp corner on one side versus the other, so that the acceptance rate on one side is much lower. You can't just give up on a proposed live point and try another one -- that'll lead to an incorrect posterior.

ColmTalbot commented 2 years ago

Hi @segasai, thanks for the comment. I appreciate the need to be careful about detailed balance, but I'll try to explain in more detail why I don't think that is the right question here.

This function is supposed to return a new point from the constrained prior distribution. (There are various places in the literature that argue that the new point doesn't have to be strictly statistically independent of the current live points, but I'm glad we agree that exact repetition is problematic.) There is also another option that dynesty gives us. At this line the returned point is checked against the likelihood constraint. This allows us to return a value that essentially means "the proposal method failed, try again".

To achieve this, we want a way to propagate to that line that the MH evolution was unsuccessful (a chain that hasn't accepted any points is not a valid chain, as it definitely hasn't converged to the correct stationary distribution). The simplest way to do that is to set the returned likelihood to some value less than loglstar, e.g. something like

    if n_accept == 0:
        logl = -np.inf

But this is undesirable, as it returns a parameter point and a likelihood that don't match each other. I guess we could set

    if n_accept == 0:
        u = np.array([np.nan] * n)
        v = np.array([np.nan] * n)
        logl = -np.inf

My proposal is just a simple extension of that, where we generate a new random point uniformly from the prior. At this stage we've given up on the attempted MCMC chain and are just doing a hail-mary panic return. There are two possibilities:

segasai commented 2 years ago

I agree that we need to communicate that the point evolution was not successful, but I believe the way you propose will not be correct: the most likely outcome is that the point will be rejected and some other live point will be selected as a new candidate, and that's not correct.

See the example from https://github.com/joshspeagle/dynesty/issues/320, i.e. imagine cases where the acceptance probability from rwalk is small on one side and very large on the other. If we fail in rwalk, we at least MUST ensure that we start from the same live point (maybe after adjusting the parameters of the proposal), but even then it is a bit worrisome.
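For illustration only, a rough sketch of that behaviour (retrying the walk from the same live point, with a hypothetical adjustment of the proposal scale; this is neither the current code nor the proposed patch):

# Keep walking from the SAME starting point u until at least one step is
# accepted, optionally shrinking the proposal scale between attempts.
while True:
    u_new, v_new, logl_new, ncall, blob = generic_random_walk(
        u, loglstar, axes, scale, prior_transform, loglikelihood,
        rstate, kwargs)
    if blob['accept'] > 0:
        break
    scale *= 0.5  # hypothetical tweak of the proposal parameters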

In fact I took your code and ran it using the example from #320

import numpy as np
import dynesty.sampling as S

def logl(x0):
    x, y = x0
    a = 10
    sl = 0.01
    x1 = np.abs(x)
    if x1 > a:
        return -1

    if np.abs(y) < (sl * (x1 - a)**2):
        return 1
    else:
        return -1

def priort(x):
    return x * 20 - 10

logls = 0  # loglstar: likelihood threshold the proposed point must exceed
axes = np.eye(2) * .1  # proposal axes (covariance) used by the random walk
scale = 1
rstate = np.random.default_rng()
kwargs = {}
u = [.5, .5]  # starting point in the unit cube
print(logl(priort(np.r_[0.5, 0.5])))  # check the starting point is above the likelihood threshold
ret = []
for i in range(100000):
    u = S.generic_random_walk(
        u,
        logls,
        axes,
        scale,
        priort,
        logl,
        rstate,
        kwargs,
    )[0]
    ret.append(u)
ret = np.array(ret)
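(For reference, the resulting distribution can be visualized with something like the following; matplotlib is assumed and was not part of the original snippet.)

import matplotlib.pyplot as plt

# Map the unit-cube samples to the parameter space and histogram them;
# a correct sampler should fill the constrained region uniformly.
xy = priort(np.asarray(ret))
plt.hist2d(xy[:, 0], xy[:, 1], bins=100)
plt.xlabel('x')
plt.ylabel('y')
plt.show()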

And I'm not getting a uniform distribution inside the diamond like I should

[image: distribution of the returned points, which is not uniform inside the diamond]

(the existing dynesty code (from master) does behave correctly)

So first and foremost, I think we should keep generic_random_walk intact, because it implements a classical MH step and should do one simple thing. I agree that we may need to consider some logic in _fill_queue(), probably for the case when we get repeats, but we need to think about and test it very carefully.

(Also, either of us could be wrong in these arguments; that's why I'd prefer to see a few actual test cases here, like the one above, so we can verify what we are talking about and that the code is correct, as this is a critical component of the code.)

segasai commented 2 years ago

Maybe an even more concrete example.

I'm worried about cases where our live points are distributed like this

import numpy as np

# 200 live points: a compact blob for x < 0 and a long, thin arm for x > 0
xs, ys = 2 * np.random.uniform(size=(2, 200)) - 1
ys[xs > 0] = ys[xs > 0] / 30.
xs[xs > 0] = 30 * xs[xs > 0]

[image: scatter of the live points, a broad blob for x < 0 and a long narrow arm for x > 0]

with, say, a unit Gaussian proposal. If we allow Gaussian walks from the narrow part of the distribution to be abandoned (because they'll rarely be accepted), then the narrow part of the distribution will just die off, which is not correct (as it has the same volume).

Obviously this is somewhat a corner case, but I don't think it's that extreme.

coveralls commented 2 years ago

Pull Request Test Coverage Report for Build 1830458021


Files with coverage reduction: py/dynesty/bounding.py (3 new missed lines, 85.06%)

Totals Coverage Status:
Change from base Build 1579868198: -0.07%
Covered Lines: 3496
Relevant Lines: 4068

💛 - Coveralls

ColmTalbot commented 2 years ago

I hadn't seen your second comment while I was writing my response, so I'll respond to both in order (if possible), as I can't seem to start threads.

Also, this is diverging more than I expected, so if you want to move this to an issue, that's fine with me.

TL;DR

I agree that when this panic return is triggered a lot, the results are likely to be bad. Running the MCMC for longer makes all of the issues go away, and using a better proposal distribution means you don't have to run the MCMC for as long.

Diamond sampling

I think the issue here is that the proposal function isn't taking enough steps before giving up. I ran a slightly modified version (to throw out the invalid points) while increasing the value of the walks parameter, and the sampling into the corner improves.

One thing that can be useful is tracking the number of accepted steps in the MCMC and using it to increase/decrease the length of the MCMC stage at each iteration, so as to maintain some target number of accepted points per MCMC attempt. This requires care, as the result can end up biased if the changes in length are too sudden, but it works pretty well.
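A minimal sketch of that kind of adaptation (a hypothetical helper, not Bilby's or dynesty's actual implementation; walks is the current chain length and naccept the number of accepted steps in the last chain):

import numpy as np

def update_walks(walks, naccept, target_accepts=5,
                 min_walks=10, max_walks=500, damping=0.9):
    """Nudge the MCMC chain length towards one that yields roughly
    target_accepts accepted steps per chain; damping keeps the changes
    gradual so the adaptation itself doesn't bias the sampling."""
    if naccept == 0:
        proposed = 2 * walks  # nothing accepted: lengthen the chain
    else:
        proposed = walks * target_accepts / naccept
    walks = damping * walks + (1 - damping) * proposed
    return int(np.clip(walks, min_walks, max_walks))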

walks = 25: [image]
walks = 50: [image]
walks = 75: [image]
walks = 100: [image]

This is a histogram of the number of accepted jumps for each of the MCMC lengths. As you can see, the longer chains more frequently have at least one jump accepted.

[image: histogram of the number of accepted jumps for each MCMC length]

Hammer sampling

(The live points look a bit like a hammer.)

Using a unit normal proposal is going to be inefficient, but that can be remedied by using a multi-ellipsoidal sampler, where the vertical and horizontal modes should be pretty easy to separate. Sampling within these modes along their principal axes should then work well.
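For illustration, a small self-contained example of selecting the multi-ellipsoid bound in dynesty (the toy likelihood below is a stand-in with one horizontal and one vertical mode, not the exact problem from this thread):

import numpy as np
import dynesty

def loglike(theta):
    # two well-separated Gaussian modes, one stretched in x and one in y
    x, y = theta
    mode1 = -0.5 * ((x + 5)**2 / 1.0 + y**2 / 0.01)
    mode2 = -0.5 * (x**2 / 0.01 + (y + 5)**2 / 1.0)
    return np.logaddexp(mode1, mode2)

def prior_transform(u):
    return 20 * u - 10  # uniform prior on [-10, 10] in each dimension

# bound='multi' fits separate ellipsoids to the two modes; rwalk then
# proposes steps using the ellipsoid axes around the chosen live point.
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim=2,
                                nlive=200, bound='multi', sample='rwalk')
sampler.run_nested()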

Other solutions include:

segasai commented 2 years ago

I am not sure we are converging here.

I think increasing the number of steps just hides the problem by making it more likely that steps will be accepted. If they are accepted there is no problem; the problem occurs when they are NOT accepted, so that is what we should be testing.

And the suggestions you give for the hammer problem are good ones, but they do not solve the underlying issue, which is that the sampling is incorrect.

It could easily be the case that I'm in the wrong here, but because we are discussing this in the abstract, it's going to be hard to agree on what's better.

For example, this is the code that samples the 'hammer' posterior using the full dynesty dynamic sampler

import dynesty
import numpy as np
import dynesty.utils as dyfunc

def logl(X):
    x, y = X
    r = 1e-3
    scale = 30
    ret1 = -0.5 * (x**2 + y**2) / r**2
    ret2 = -0.5 * ((x / scale)**2 + (y / (1. / scale))**2) / r**2
    return ret1 * (x < 0) + ret2 * (x >= 0)

def cube(x):
    return 2 * x - 1

def doit(sample, bound='single'):
    rstate = np.random.default_rng(1)
    samp = dynesty.DynamicNestedSampler(
        logl,
        cube,
        2,
        nlive=100,
        bound=bound,
        sample=sample,
        # 'rwalk',
        # sample='rslice',
        rstate=rstate)
    samp.run_nested(n_effective=20000)
    samples = samp.results.samples
    wt = np.exp(samp.results.logwt - samp.results.logz[-1])
    samples_equal = dyfunc.resample_equal(samples, wt)
    return samples_equal
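(The comparison below was presumably produced with calls along these lines; a hedged guess rather than a verbatim part of the original comment.)

samples_truth = doit('rslice', bound='multi')  # reference run, treated as the truth
samples_rwalk = doit('rwalk', bound='single')  # configuration under test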

And here I compare the rslice, multi results (assumed to be the truth; black) against the rwalk, single results (red histogram).

Here is the result with dynesty/master: [image]

And here is the result with your patch: [image]

I'd argue that even default dynesty/master is pretty bad, but with the patch the agreement is even worse.

So I'm not convinced that this is an improvement.

ColmTalbot commented 2 years ago

I disagree; I think we are converging well. I agree that the proposed fix does not address the underlying issue, which is that the dynesty rwalk proposal is not good at efficiently sampling complex spaces. I'll close this PR and continue to think about ways the sampling method could be improved.

Thanks for the discussion!

segasai commented 2 years ago

It's great that we did converge. From my point of view, the rslice sampler is much more stable because it is guaranteed to always produce a sample (at the cost of needing more steps for full mixing). In fact, one possibility is to utilise rslice as a fall-back option for rwalk (although one has to think about detailed balance).

segasai commented 2 years ago

For what it is worth, here's a patch implementing the idea that, in the case of a failed rwalk step, we do an rslice step instead

diff --git a/py/dynesty/sampling.py b/py/dynesty/sampling.py
index 9e60d2a..ce0f15b 100644
--- a/py/dynesty/sampling.py
+++ b/py/dynesty/sampling.py
@@ -148,8 +148,14 @@ def sample_rwalk(args):
     (u, loglstar, axes, scale, prior_transform, loglikelihood, rseed,
      kwargs) = args
     rstate = get_random_generator(rseed)
-    return generic_random_walk(u, loglstar, axes, scale, prior_transform,
-                               loglikelihood, rstate, kwargs)
+    ret = generic_random_walk(u, loglstar, axes, scale, prior_transform,
+                              loglikelihood, rstate, kwargs)
+    if ret[-1]['accept'] == 0:
+        warnings.warn('Random walk failed to generate a single accepted point.'
+                      ' Generating a proposal using rslice sampler instead')
+        ret1 = sample_rslice(args)
+        return ret1[0], ret1[1], ret1[2], ret1[3] + ret[3], ret[4]
+    return ret

This will not produce repeated samples, but for my test problem it doesn't really improve the posterior either, so I don't want to apply this patch as it is. I think what's needed is some tracking of the correlation between samples.
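As a hedged illustration of what such tracking could look like, one could estimate the integrated autocorrelation time of the walk and only return a point once enough effectively independent steps have been taken (a sketch, not part of dynesty):

import numpy as np

def integrated_autocorr_time(chain, c=5.0):
    """Estimate the integrated autocorrelation time of a 1-d chain using a
    self-consistent window (Sokal's criterion)."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    acf = np.correlate(x, x, mode='full')[n - 1:]
    acf = acf / acf[0]
    tau = 1.0
    for m in range(1, n):
        tau += 2.0 * acf[m]
        if m >= c * tau:  # stop once the window is several times tau
            break
    return tau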