JohannesBuchner / UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.
https://johannesbuchner.github.io/UltraNest/

Explore/characterize non-dominant modes in posterior #137

Closed: jaoleary closed this issue 3 months ago

jaoleary commented 3 months ago

Description

Problem: I would like to identify and characterize some local modes in addition to the global solution.

Setup: As a test case, run the ReactiveNestedSampler on a two-dimensional Styblinski-Tang function. The test problem has a single global minimum and three additional local minima (see image below).

[image: the two-dimensional Styblinski-Tang test function, showing the global minimum and three local minima]

Outcome: As expected, the sampler converges to the global solution of the test problem.

[image: posterior from the completed run, concentrated on the global solution]

During the run, the local modes are identified and labeled, but they disappear as the run converges (as expected).

Question: How can I recover and characterize the non-dominant (local) modes in addition to the global solution?

What I Did

I ran the sampler on the test problem with the code below:

import benchmark_functions as bf
import ultranest as un
from ultranest.plot import traceplot  # needed for the traceplot call at the end

log_dir = "./"
n_dim = 2**1

likelihood = bf.StyblinskiTang(n_dimensions=n_dim, opposite=True)

def transform(cube):
    # map the unit cube to the search box [-5, 5] in every dimension
    lo = -5
    hi = 5
    return cube * (hi - lo) + lo

sampler = un.ReactiveNestedSampler(
    [f"x{i}" for i in range(n_dim)],
    likelihood,
    transform,
    log_dir=log_dir,
    run_num=0,
)
result = sampler.run()
sampler.print_results()

# trace plot of the run (sampler.plot_trace() produces a similar figure)
fig, ax = traceplot(
    results=sampler.run_sequence,
    labels=sampler.paramnames,
    quantiles=[0.14, 0.5, 0.86],
)
JohannesBuchner commented 3 months ago

You can select posterior samples (weighted or unweighted) by cuts, and compute their relative probability from that.
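For example, a minimal sketch of such a cut-based selection could look like this (assuming the equally weighted posterior samples in result["samples"] returned by run(); the specific cut is a hypothetical choice that isolates one quadrant of the 2-d Styblinski-Tang box):

import numpy as np

samples = result["samples"]            # equally weighted posterior samples, shape (n, n_dim)

# hypothetical cut isolating one local mode (one quadrant of the box)
in_mode = (samples[:, 0] > 0) & (samples[:, 1] < 0)

# fraction of posterior mass inside the cut = relative probability of that mode
rel_prob = in_mode.mean()
print(f"relative posterior probability of the selected mode: {rel_prob:.3f}")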

For a rigorous automated approach, we would need a mathematical definition that identifies a local mode, which requires choosing a threshold.
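For instance, one simple (non-rigorous) way to implement such a threshold-based definition on a 2-d posterior is to histogram the equally weighted samples and label connected regions above a chosen density threshold; the threshold is exactly the arbitrary choice mentioned above:

import numpy as np
from scipy import ndimage

samples = result["samples"]                      # equally weighted posterior samples

# 2-d histogram of the posterior
H, xedges, yedges = np.histogram2d(samples[:, 0], samples[:, 1], bins=50)

# label connected bins above an (arbitrary) density threshold as separate modes
threshold = 0.01 * H.max()
labels, nmodes = ndimage.label(H > threshold)

print("modes found:", nmodes)
for k in range(1, nmodes + 1):
    mass = H[labels == k].sum() / H.sum()
    print(f"mode {k}: posterior mass {mass:.3f}")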

jaoleary commented 3 months ago

If at any point during the run more than one mode is identified (as below),

[image: live point visualization during the run, with several identified modes labeled]

are these cluster values stored anywhere in the output, or is this information discarded? I see that the current cluster information is stored in the transformLayer.

JohannesBuchner commented 3 months ago

Yeah, to achieve high computational efficiency, this information is overwritten in place. You could write a viz_callback function and store it yourself.
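A minimal sketch of such a callback, assuming the keyword arguments that ultranest passes to its default nicelogger visualization (points, info, region, transformLayer, region_fresh), and that the transform layer exposes nclusters and clusterids:

cluster_history = []

def record_clusters(points=None, info=None, region=None, transformLayer=None,
                    region_fresh=False):
    # store the current number of clusters and the cluster id of each live point
    cluster_history.append({
        "iteration": info.get("it") if info is not None else None,
        "nclusters": transformLayer.nclusters,
        "clusterids": transformLayer.clusterids.copy(),
        "live_points_u": points["u"].copy(),
    })

result = sampler.run(viz_callback=record_clusters)

Note that passing a custom viz_callback replaces the default live-point display; you can call the default logger from inside your callback if you want to keep it.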

Alternatively, you could have a look at these two functions:

You could combine these pieces to read a file, build an initial MLFriends region, update the region every N iterations in a loop, and observe the number of clusters.
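Roughly, one iteration of such a loop might look like the sketch below. It uses the internal MLFriends/ScalingLayer machinery the sampler itself uses; live_u is a placeholder for the live points at some iteration (e.g. taken from a callback record like the one above), and the exact signatures may differ between versions:

import numpy as np
from ultranest.mlfriends import MLFriends, ScalingLayer

# live_u: (nlive, n_dim) array of live points in the unit cube at some iteration
live_u = cluster_history[-1]["live_points_u"]

# build a transform layer and an MLFriends region around the live points
transformLayer = ScalingLayer()
transformLayer.optimize(live_u, live_u)
region = MLFriends(live_u, transformLayer)

# bootstrap the MLFriends radius, then re-cluster with the new radius
r, f = region.compute_enlargement(nbootstraps=30)
region.maxradiussq = r
region.enlarge = f
newlayer = transformLayer.create_new(live_u, region.maxradiussq)

print("number of clusters:", newlayer.nclusters)
print("cluster id of each live point:", newlayer.clusterids)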

JohannesBuchner commented 3 months ago

.... updated a link

jaoleary commented 3 months ago

Great, thanks for the feedback, I will try that!