JohannesBuchner / UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.
https://johannesbuchner.github.io/UltraNest/

Explore/characterize non-dominant modes in posterior #137

Closed: jaoleary closed this issue 3 months ago

jaoleary commented 3 months ago

Description

Problem: I would like to identify and characterize some local modes in addition to the global solution.

Setup: As a test case, run the ReactiveNestedSampler on a two-dimensional Styblinski-Tang function. The test problem has a single global minimum and three additional local minima (see image below).

[image: the two-dimensional Styblinski-Tang test function, showing the global minimum and three local minima]

Outcome: As expected, the sampler converges to the global solution of the test problem.

[image: posterior from the completed run, concentrated on the global solution]

During the run, the local modes are identified and labeled, but they disappear as the run converges (as expected).

Question: How can I recover and characterize the non-dominant (local) modes in addition to the global solution?

What I Did

I ran the sampler on the test problem with the code below:

import benchmark_functions as bf
import ultranest as un
from ultranest.plot import traceplot  # needed for the traceplot call at the end

log_dir = "./"
n_dim = 2**1

likelihood = bf.StyblinskiTang(n_dimensions=n_dim, opposite=True)

def transform(cube):
    # map the unit cube to the search box [-5, 5] in every dimension
    lo = -5
    hi = 5
    return cube * (hi - lo) + lo

sampler = un.ReactiveNestedSampler(
    [f"x{i}" for i in range(n_dim)],
    likelihood,
    transform,
    log_dir=log_dir,
    run_num=0,
)
result = sampler.run()
sampler.print_results()

# trace plot of the run (sampler.plot_trace() produces a similar figure)
fig, ax = traceplot(
    results=sampler.run_sequence,
    labels=sampler.paramnames,
    quantiles=[0.14, 0.5, 0.86],
)
JohannesBuchner commented 3 months ago

You can select posterior samples (weighted or unweighted) by cuts, and compute their relative probability from that.
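For example, a minimal sketch of such a cut-based selection could look like this (assuming the equally weighted posterior samples in result["samples"] returned by run(); the specific cut is a hypothetical choice that isolates one quadrant of the 2-d Styblinski-Tang box):

import numpy as np

samples = result["samples"]            # equally weighted posterior samples, shape (n, n_dim)

# hypothetical cut isolating one local mode (one quadrant of the box)
in_mode = (samples[:, 0] > 0) & (samples[:, 1] < 0)

# fraction of posterior mass inside the cut = relative probability of that mode
rel_prob = in_mode.mean()
print(f"relative posterior probability of the selected mode: {rel_prob:.3f}")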

For a rigorous automated approach, we would need a mathematical definition that identifies a local mode, which requires choosing a threshold.
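For instance, one simple (non-rigorous) way to implement such a threshold-based definition on a 2-d posterior is to histogram the equally weighted samples and label connected regions above a chosen density threshold; the threshold is exactly the arbitrary choice mentioned above:

import numpy as np
from scipy import ndimage

samples = result["samples"]                      # equally weighted posterior samples

# 2-d histogram of the posterior
H, xedges, yedges = np.histogram2d(samples[:, 0], samples[:, 1], bins=50)

# label connected bins above an (arbitrary) density threshold as separate modes
threshold = 0.01 * H.max()
labels, nmodes = ndimage.label(H > threshold)

print("modes found:", nmodes)
for k in range(1, nmodes + 1):
    mass = H[labels == k].sum() / H.sum()
    print(f"mode {k}: posterior mass {mass:.3f}")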

jaoleary commented 3 months ago

If at any point during the run more than one mode is identified (as below),

[image: live point visualization during the run, with several identified modes labeled]

are these cluster values stored anywhere in the output, or is this information discarded? I see that the current cluster information is stored in the transformLayer.

JohannesBuchner commented 3 months ago

Yeah, to achieve high computational efficiency, this information is overwritten in place. You could write a viz_callback function and store it yourself.
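A minimal sketch of such a callback, assuming the keyword arguments that ultranest passes to its default nicelogger visualization (points, info, region, transformLayer, region_fresh), and that the transform layer exposes nclusters and clusterids:

cluster_history = []

def record_clusters(points=None, info=None, region=None, transformLayer=None,
                    region_fresh=False):
    # store the current number of clusters and the cluster id of each live point
    cluster_history.append({
        "iteration": info.get("it") if info is not None else None,
        "nclusters": transformLayer.nclusters,
        "clusterids": transformLayer.clusterids.copy(),
        "live_points_u": points["u"].copy(),
    })

result = sampler.run(viz_callback=record_clusters)

Note that passing a custom viz_callback replaces the default live-point display; you can call the default logger from inside your callback if you want to keep it.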

Alternatively, you could have a look at these two functions:

You could combine these pieces to read a file, build an initial MLFriends region, update the region every N iterations in a loop, and observe the number of clusters.
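Roughly, one iteration of such a loop might look like the sketch below. It uses the internal MLFriends/ScalingLayer machinery the sampler itself uses; live_u is a placeholder for the live points at some iteration (e.g. taken from a callback record like the one above), and the exact signatures may differ between versions:

import numpy as np
from ultranest.mlfriends import MLFriends, ScalingLayer

# live_u: (nlive, n_dim) array of live points in the unit cube at some iteration
live_u = cluster_history[-1]["live_points_u"]

# build a transform layer and an MLFriends region around the live points
transformLayer = ScalingLayer()
transformLayer.optimize(live_u, live_u)
region = MLFriends(live_u, transformLayer)

# bootstrap the MLFriends radius, then re-cluster with the new radius
r, f = region.compute_enlargement(nbootstraps=30)
region.maxradiussq = r
region.enlarge = f
newlayer = transformLayer.create_new(live_u, region.maxradiussq)

print("number of clusters:", newlayer.nclusters)
print("cluster id of each live point:", newlayer.clusterids)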

JohannesBuchner commented 3 months ago

.... updated a link

jaoleary commented 3 months ago

Great, thanks for the feedback, I will try that!