JohannesBuchner / UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.
https://johannesbuchner.github.io/UltraNest/

ReactiveNestedSampler finishing on the first iteration #60

Closed depra closed 2 years ago

depra commented 2 years ago

Description

I am trying to model two pieces of observational data: an observed curve, and an additional physical quantity (hereafter, curve and physical). The generative model has 22 parameters and produces these two pieces of information separately. Of these parameters, 11 are fractions that must sum to 1 (for which I am using a Dirichlet prior), and 11 come from a uniform distribution. I define the problem so that the log-likelihood outside the physical uncertainty is -1e300 (in the example below, the physical value should be between 0.5 and 0.7).

The problem is that on the first iteration none of the sampled points fall within the physical uncertainty limits, so all the live points have the same likelihood of -1e300. As a result, the run finishes on the very first iteration and returns:

logZ = -1000000000000000052504760255204420248704468581108159154915854115511802457988908195786371375080447864043704443832883878176942523235360430575644792184786706982848387200926575803737830233794788090059368953234970799945081119038967640880074652742780142494579258788820056842838115669472196386865459400540160.000 +- 0.000
  single instance: logZ = -1000000000000000052504760255204420248704468581108159154915854115511802457988908195786371375080447864043704443832883878176942523235360430575644792184786706982848387200926575803737830233794788090059368953234970799945081119038967640880074652742780142494579258788820056842838115669472196386865459400540160.000 +- nan
  bootstrapped   : logZ = -1000000000000000201206451102982726528510718396098215168041874281451248363566094127380437091120885218560535893448518937156814902254657735621103316739277277761931445311166038382034914278540775484328009936664744486969000697274111361486849523430568151310289152823685865144042626214886587669241994282008576.000 +- 0.000
  tail           : logZ = +- 0.000
insert order U test : converged: True correlation: inf iterations

    0                   0.090 +- 0.083
    1                   0.089 +- 0.082
    2                   0.093 +- 0.084
    3                   0.090 +- 0.080
    4                   0.091 +- 0.082
    5                   0.090 +- 0.082
    6                   0.090 +- 0.081
    7                   0.091 +- 0.083
    8                   0.092 +- 0.082
    9                   0.090 +- 0.082
    10                  0.093 +- 0.085
    11                  : 0.00  │▆▇▆▆▅▆▆▇▇▆▆▄▆▆▆▆▇▇▇▆▆▅▆▆▇▇▆▆▇▇▆▇▆▆▇▆▆▇▆│1.00      0.51 +- 0.29
    12                  : 0.00  │▇▆▆▆▇▆▇▇▅▇▆▆▇▆▇▇▇▆▆▇▆▆▇▆▇▅▇▆▆▅▇▆▆▇▆▇▇▆▆│1.00      0.50 +- 0.29
    13                  : 0.00  │▅▇▆▆▆▆▇▅▆▅▇▇▆▇▇▆▆▆▇▆▆▅▇▅▆▆▆▇▆▆▆▇▆▅▆▇▆▆▇│1.00      0.51 +- 0.29
    14                  : 0.00  │▆▇▆▇▆▆▇▇▇▇▇▇▇▆▇▆▆▇▆▇▇▇▅▆▆▇▇▇▇▇▇▆▆▅▇▇▆▇▆│1.00      0.50 +- 0.29
    15                  : 0.00  │▇▇▆▆▆▇▆▆▆▆▇▇▇▅▆▆▆▆▆▆▆▇▆▇▆▆▇▆▆▆▇▆▅▇▆▆▇▆▇│1.00      0.50 +- 0.29
    16                  : 0.00  │▆▇▇▇▇▆▅▆▇▆▇▇▆▆▆▇▆▆▇▆▆▇▇▇▇▇▇▇▆▇▇▇▆▇▇▇▆▆▆│1.00      0.50 +- 0.29
    17                  : 0.00  │▆▆▆▅▇▅▆▆▆▆▆▆▆▇▆▇▅▇▅▇▆▇▆▅▆▆▇▅▆▇▅▆▆▇▆▇▅▆▇│1.00      0.50 +- 0.29
    18                  : 0.00  │▇▆▇▆▇▆▇▅▆▇▆▇▆▇▅▆▇▅▇▇▆▅▆▇▆▇▇▇▇▆▆▇▇▇▆▇▅▇▇│1.00      0.50 +- 0.29
    19                  : 0.00  │▆▆▆▆▆▆▅▆▇▇▆▆▆▆▆▆▇▆▆▆▆▅▆▆▆▆▆▆▆▇▇▇▆▇▆▆▆▅▇│1.00      0.50 +- 0.29
    20                  : 0.00  │▆▇▆▇▇▇▇▆▇▇▇▇▇▇▆▇▇▆▇▇▇▆▇▇▆▇▆▆▅▇▇▇▆▅▅▇▇▇▆│1.00      0.50 +- 0.29
    21                  : 0.00  │▇▆▆▇▇▇▆▇▇▆▇▆▆▆▆▆▆▆▇▅▇▆▅▇▆▇▆▇▇▇▆▆▆▆▇▇▅▆▇│1.00      0.50 +- 0.29

In this case the correct values would be [0, 0.01, 0, 0.98, 0, 0, 0.01, 0, 0, 0, 0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], where the first 11 terms sum to 1 and the remaining ones can take any value between 0 and 1.

What I Did

import numpy as np
import ultranest
import ultranest.stepsampler

def likelihood(theta):
    curve, val = generative_model(theta)
    if 0.5 <= val <= 0.7:
        log_like = metric(data, curve)
    else:
        log_like = -1e300
    return log_like

def prior(cube):
    # -log of uniforms, normalized, gives a flat Dirichlet over the fractions
    fraction = -np.log(cube[:11])
    theta_fraction = fraction / np.sum(fraction)
    other = np.array(cube[11:])
    return np.concatenate([theta_fraction, other])

def run():
    param_names = [str(i) for i in range(22)]
    sampler = ultranest.ReactiveNestedSampler(param_names, likelihood, prior)
    sampler.stepsampler = ultranest.stepsampler.SliceSampler(
        nsteps=40, region_filter=True, adaptive_nsteps='move-distance',
        generate_direction=ultranest.stepsampler.generate_cube_oriented_direction)
    sampler.run(min_num_live_points=1000)
    return sampler
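For reference, the -np.log trick in my prior transform maps the 11 i.i.d. unit-cube coordinates to a flat Dirichlet (normalized Exponential(1) draws). A quick self-contained sanity check (the seed and size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

# A unit-cube sample, as UltraNest passes to the prior transform.
cube = rng.uniform(size=11)

# -log of i.i.d. Uniform(0,1) draws are Exponential(1) draws;
# normalizing exponentials yields a flat Dirichlet over the simplex.
fraction = -np.log(cube)
theta_fraction = fraction / np.sum(fraction)

print(theta_fraction.sum())  # each component in [0, 1], summing to 1
```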

I tried different generate_direction functions and step sampler classes, but I keep getting the same result. Is there any recommendation to prevent this from happening?

JohannesBuchner commented 2 years ago

As the debug log will confirm, you have a likelihood plateau: multiple live points share the same likelihood value. To support plateaus within nested sampling, the number of live points has to be reduced. If all live points have the same value, the problem appears trivially solved (uninformative likelihood, prior = posterior).

I would suggest modifying your likelihood so that you do not return a flat -1e300, but instead introduce a slight slope that increases towards the physical space. For example, in the else clause, set log_like = -1e300 * np.abs(val - 0.6).
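A minimal sketch of that penalty (the midpoint 0.6 follows the formula above; `log_penalty` is a hypothetical helper, not part of UltraNest):

```python
import numpy as np

def log_penalty(val, lo=0.5, hi=0.7):
    """Sloped penalty for unphysical values: still hugely negative,
    but less negative the closer val is to the physical window,
    so the sampler can climb towards it instead of seeing a plateau."""
    return -1e300 * np.abs(val - 0.5 * (lo + hi))

# The penalty grows monotonically with distance from the window,
# so points nearer the physical region are always preferred.
print(log_penalty(0.8) > log_penalty(0.9) > log_penalty(1.0))
```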

Even better would be to define your prior so that the unphysical space cannot be proposed at all.
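For example, if the constraint could be expressed directly on one parameter (a hypothetical simplification; in your setup `val` comes out of `generative_model`), the prior transform could map that unit-cube coordinate straight into [0.5, 0.7], so unphysical values are never proposed:

```python
import numpy as np

def prior_transform(cube, lo=0.5, hi=0.7):
    theta = np.array(cube, dtype=float)
    # Hypothetical: suppose theta[0] is the physically constrained quantity.
    # Mapping its unit-cube coordinate linearly into [lo, hi] guarantees
    # every proposed point is physical, so no -1e300 plateau can arise.
    theta[0] = lo + (hi - lo) * cube[0]
    return theta

u = np.random.default_rng(0).uniform(size=3)
t = prior_transform(u)
print(t[0])  # always lies in [0.5, 0.7]
```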

JohannesBuchner commented 2 years ago

To answer your last question: this is independent of the proposal.

depra commented 2 years ago

Thank you @JohannesBuchner! I will try both approaches!