joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License
346 stars 76 forks source link

blob issues when add_live #475

Closed carlosRmelo closed 2 months ago

carlosRmelo commented 3 months ago

Dynesty version dynesty v. 2.0.1 - installation via pip

Hi everyone! This may be a naive question or maybe I’m doing something wrong when I continue my sampling to add more new samples.

But the problem is that every time I continue an existing sampling, this adds the current live points again to the blob. And even if I do sampler._remove_live_points(), this does not remove the added information in the blob. I checked the list of keywords in _remove_live_points(), and I noticed that ‘blob’ is not there, but I think it should be, right?

Not sure if I made myself clear enough, in case not, let me know and I'll try to elaborate. Thanks to everyone in advance.

segasai commented 3 months ago

Hi,

I'm not sure I fully follow. Could you please provide an example/demonstration ? If you think it is an issue please follow something close to the template https://github.com/joshspeagle/dynesty/blob/master/.github/ISSUE_TEMPLATE/bug_report.md Also it'd be a bit better if you used the latest released dynesty version.

Thanks

carlosRmelo commented 3 months ago

Hi, thanks for your time and sorry for not being clear before. I paste a toy example below to show my issue. Hope it will be clearer.

My issue is that the number of blobs does not match the number of samples if I continue adding more samples. However, I expected the number of blobs should be the same as the number of samples.

Now using dynesty version: 2.1.3 Installation via pip

import dynesty
import numpy as np
from dynesty import NestedSampler
rstate= np.random.default_rng(56101)

# truth
m_true = -0.9594
b_true = 4.294
f_true = 0.534

# generate mock data
N = 50
x = np.sort(10 * rstate.uniform(size=N))
yerr = 0.1 + 0.5 * rstate.uniform(size=N)
y_true = m_true * x + b_true
y = y_true + np.abs(f_true * y_true) * rstate.normal(size=N)
y += yerr * rstate.normal(size=N)

# log-likelihood
def loglike(theta):
    ## I also want to return 'model' bellow as a blob 
    m, b, lnf = theta
    model = m * x + b
    inv_sigma2 = 1.0 / (yerr**2 + model**2 * np.exp(2 * lnf))
    figure_of_merit = -0.5 * (np.sum((y-model)**2 * inv_sigma2 - np.log(inv_sigma2)))

    return figure_of_merit, model

# prior transform
def prior_transform(utheta):
    um, ub, ulf = utheta
    m = 5.5 * um - 5.
    b = 10. * ub
    lnf = 11. * ulf - 10.

    return m, b, lnf

## sampler
sampler = NestedSampler(loglike, prior_transform, 
                                    3, blob=True, nlive=50)

My goal is to run this for X iterations and perform some outputs. Check convergence (dlogz < threshold), if not satisfied, run for more X iterations.

sampler.run_nested(maxiter=50)

I expected the blob to have a shape (N, 50), with N being the number of samples. This works fine if I run only once. In this case,

## sampler.results.samples.shape = (101, 3)
## sampler.results.blob.shape    = (101, 50)

print(sampler.results.samples.shape, sampler.results.blob.shape)

Adding more samples sampler.run_nested(maxiter=100)

But now I have:

## sampler.results.samples.shape = (202,3)
## sampler.results.blob.shape    = (252,50)

print(sampler.results.samples.shape, sampler.results.blob.shape)

However, I expected to have the same number of blobs as the number of samples

Doing it again: sampler.run_nested(maxiter=100)

I have the same issue as above. Also, note that looks like the number of live points are being added.

## sampler.results.samples.shape = (303,3)
## sampler.results.blob.shape      = (403,50)

print(sampler.results.samples.shape, sampler.results.blob.shape)

Checking the file sampler.py, looks like every time .run_nested()is called, the live points of a previous run are removed if they were added (see lines 720-732). Looking at _remove_live_points(), there is a list with the keywords where the live points should be removed, but I can't see blob there. However, I suppose, it should be there, right?

segasai commented 3 months ago

Thanks for the details! That looks like a bug indeed. I will try to fix in next few days (and add an associated test). To be honest the execution of run_nested() several times (without resume=) is something of a weird animal, that I'd prefer to deprecate, because it is not exactly clear what it means.

carlosRmelo commented 3 months ago

Thanks for the quick response. And indeed, now that you mentioned the weirdness of doing run_nested() several times, if I run the same thing as above, but externally using the generator I don't have these issues anymore.