JohannesBuchner / UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.
https://johannesbuchner.github.io/UltraNest/

Very slow/wrong convergence for Lennard-Jones potential #148

Open Binbose opened 1 month ago

Binbose commented 1 month ago

I am trying to calculate the partition function of two particles interacting via the Lennard-Jones potential. I implemented the integral both with UltraNest and with torchquad. With torchquad I get logZ = -0.114 (after a couple of seconds), both with VEGAS Monte Carlo integration across three random seeds and with the deterministic Boole rule, which makes me believe the torchquad result is correct. With UltraNest, however, I get logZ of about -0.117 after an hour of compute, so now I am unsure which algorithm to believe (and, if torchquad is correct, why UltraNest converges incorrectly and so slowly). Maybe I have to use different settings for UltraNest? This is my UltraNest code:

import argparse
import numpy as np
import einops

parser = argparse.ArgumentParser()

parser.add_argument('--n_particle', type=int, default=2,
                    help="Dimensionality")
parser.add_argument("--num_live_points", type=int, default=400)
parser.add_argument('--sigma', type=float, default=0.439)
parser.add_argument('--slice', action='store_true')
parser.add_argument('--slice_steps', type=int, default=100)
parser.add_argument('--log_dir', type=str, default='logs/loggauss')

args = parser.parse_args()

n_particle = args.n_particle
ndim = n_particle * 3
sigma = args.sigma

def pairwise_distances_squared(points):
    # upper-triangular pair indices, excluding the diagonal
    i, j = np.triu_indices(n_particle, 1)
    # squared distance for every particle pair, shape (batch, n_pairs)
    return np.sum(np.square(points[:, i] - points[:, j]), axis=-1)

def log_likelihood(x):
    """
    V12-6 potential for LJ.
    """

    x = einops.rearrange(x, 'batch (n_particle n_space_dim) -> batch n_particle n_space_dim', n_particle=n_particle, n_space_dim=3)
    r2 = pairwise_distances_squared(x / sigma)
    r6 = r2 * r2 * r2
    r12 = r6 * r6
    E = (r12 ** (-1) - r6 ** (-1))

    return -E.sum(-1)

def transform(x):
    return x

paramnames = ['param%d' % (i+1) for i in range(ndim)]

# set up nested sampler:

from ultranest import ReactiveNestedSampler

sampler = ReactiveNestedSampler(paramnames, log_likelihood, transform=transform, 
    #log_dir=args.log_dir + 'RNS-%dd' % ndim, resume=True,
    vectorized=True)

if args.slice:
    # set up step sampler. Here, we use a differential evolution slice sampler:
    import ultranest.stepsampler
    sampler.stepsampler = ultranest.stepsampler.SliceSampler(
        nsteps=args.slice_steps,
        generate_direction=ultranest.stepsampler.generate_mixture_random_direction,
        # log_dir=args.log_dir + 'RNS-%dd' % ndim,
    )

# run sampler, with a few custom arguments:
sampler.run(dlogz=0.01,  # 0.5 + 0.1 * ndim
    update_interval_volume_fraction=0.4 if ndim > 20 else 0.2,
    max_num_improvement_loops=3,
    min_num_live_points=args.num_live_points)

sampler.print_results()

if args.slice:
    sampler.stepsampler.plot(filename=args.log_dir + 'RNS-%dd/stepsampler_stats_regionslice.pdf' % ndim)

sampler.plot()

Run with 1500 live points, it produces this output:

python ultra_nest_example.py --num_live_points 1500
[ultranest] To achieve the desired logz accuracy, min_num_live_points was increased to 3163
[ultranest] Sampling 3163 live points from prior ...

Mono-modal Volume: ~exp(-5.17) * Expected Volume: exp(0.00) Quality: ok

param1..param6: live points span the full prior range 0..1 (per-parameter bars condensed)

Z=-0.4(72.38%) | Like=0.13..0.25 [0.1284..0.1284]*| it/evals=5056/15615 eff=40.6039% N=3163

Mono-modal Volume: ~exp(-6.51) * Expected Volume: exp(-1.61) Quality: ok

param1..param6: live points span the full prior range 0..1 (per-parameter bars condensed)

Z=-0.2(94.24%) | Like=0.24..0.25 [0.2406..0.2406]*| it/evals=10169/66431 eff=16.0729% N=3163 

Mono-modal Volume: ~exp(-6.51)   Expected Volume: exp(-3.22) Quality: ok

param1..param6: live points span the full prior range 0..1 (per-parameter bars condensed)

[ultranest] Explored until L=0.2  [0.2490..0.2490]*| it/evals=13819/196253 eff=7.1568% N=3163
[ultranest] Likelihood function evaluations: 196556
[ultranest]   logZ = -0.1137 +- 0.006664
[ultranest] Effective samples strategy satisfied (ESS = 7022.7, need >400)
[ultranest] Posterior uncertainty strategy is satisfied (KL: 0.46+-0.05 nat, need <0.50 nat)
[ultranest] logz error is dominated by tail. Decrease frac_remain to make progress.
[ultranest] Evidency uncertainty strategy wants 13035 minimum live points (dlogz from 0.01 to 0.01, need <0.01)
[ultranest]   logZ error budget: single: 0.01 bs:0.01 tail:0.02 total:0.02 required:<0.01
[ultranest] Widening roots to 13035 live points (have 3163 already) ...
[ultranest] Sampling 9872 live points from prior ...
[ultranest] Warning: The log-likelihood has a large plateau with L=-1.92197e+15.

  Probably you are returning a low value when the parameters are problematic/unphysical.
  ultranest can handle this correctly, by discarding live points with the same loglikelihood.
  (arxiv:2005.08602 arxiv:2010.13884). To mitigate running out of live points,
  the initial number of live points is increased. But now this has reached over 10000 points.

  You can avoid this making the loglikelihood increase towards where the good region is.
  For example, let's say you have two parameters where the sum must be below 1. Replace this:

    if params[0] + params[1] > 1:
         return -1e300

  with:

    if params[0] + params[1] > 1:
         return -1e300 * (params[0] + params[1])

  The current strategy will continue until 50000 live points are reached.
  It is safe to ignore this warning.

Mono-modal Volume: ~exp(-6.05) * Expected Volume: exp(-0.00) Quality: ok

param1..param6: live points span the full prior range 0..1 (per-parameter bars condensed)

Z=-0.4(72.61%) | Like=0.12..0.25 [0.1242..0.1242]*| it/evals=20978/246488 eff=39.9551% N=13035

Mono-modal Volume: ~exp(-7.42) * Expected Volume: exp(-1.61) Quality: ok

param1..param6: live points span the full prior range 0..1 (per-parameter bars condensed)

Z=-0.2(94.24%) | Like=0.24..0.25 [0.2404..0.2404]*| it/evals=41960/389108 eff=17.4217% N=13035 

Mono-modal Volume: ~exp(-7.99) * Expected Volume: exp(-3.22) Quality: ok

param1..param6: live points span the full prior range 0..1 (per-parameter bars condensed)

[ultranest] Explored until L=0.2  [0.2490..0.2490]*| it/evals=56941/733477 eff=8.1805% N=13035
[ultranest] Likelihood function evaluations: 733477
[ultranest]   logZ = -0.1169 +- 0.002759
[ultranest] Effective samples strategy satisfied (ESS = 28957.2, need >400)
[ultranest] Posterior uncertainty strategy is satisfied (KL: 0.46+-0.01 nat, need <0.50 nat)
[ultranest] Evidency uncertainty strategy is satisfied (dlogz=0.02, need <0.01)
[ultranest]   logZ error budget: single: 0.00 bs:0.00 tail:0.02 total:0.02 required:<0.01
[ultranest] done iterating.

logZ = -0.117 +- 0.019
  single instance: logZ = -0.117 +- 0.004
  bootstrapped   : logZ = -0.117 +- 0.005
  tail           : logZ = +- 0.018
insert order U test : converged: True correlation: inf iterations

    param1              : 0.00  │▇▇▇▇▇▇▇▆▇▇▇▇▆▇▇▇▇▆▇▇▇▇▇▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇│1.00      0.50 +- 0.29
    param2              : 0.00  │▇▇▇▇▇▇▇▇▇▇▇▆▆▆▇▇▇▇▇▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇│1.00      0.50 +- 0.29
    param3              : 0.00  │▇▇▇▇▇▇▇▇▆▇▇▇▇▆▆▆▇▇▆▆▆▆▆▇▇▇▇▇▇▇▇▇▇▆▇▇▇▇▇│1.00      0.50 +- 0.29
    param4              : 0.00  │▇▇▇▇▇▇▇▇▇▆▆▆▇▆▆▆▆▆▆▆▇▇▆▆▆▇▆▇▇▇▆▇▇▇▇▇▇▇▇│1.00      0.50 +- 0.29
    param5              : 0.00  │▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▆▇▇▇▇▇▆▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇│1.00      0.50 +- 0.29
    param6              : 0.00  │▇▇▇▇▇▆▇▇▇▆▇▇▇▇▇▇▆▇▆▇▇▆▆▇▆▆▇▆▇▇▇▇▇▇▇▇▇▇▇│1.00      0.50 +- 0.29
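
As an aside, the plateau warning in the log above can be acted on directly in this likelihood: near-coincident particles produce enormous energies that collapse onto one huge negative loglikelihood value. Below is a minimal sketch of the warning's suggestion applied here; the threshold and the penalty form are illustrative assumptions, not something prescribed by UltraNest:

def log_likelihood_sloped(x):
    """V12-6 energy as above, but with a slope on the overflow plateau."""
    x = einops.rearrange(x, 'batch (n_particle n_space_dim) -> batch n_particle n_space_dim', n_particle=n_particle, n_space_dim=3)
    r2 = pairwise_distances_squared(x / sigma)
    r6 = r2 * r2 * r2
    logl = -(1.0 / (r6 * r6) - 1.0 / r6).sum(-1)
    # replace plateau values with a penalty that increases with the smallest
    # pair separation, so live points on the plateau stay ordered
    bad = ~np.isfinite(logl) | (logl < -1e14)   # illustrative threshold
    if bad.any():
        slope = 1.0 / (1.0 + r2[bad].min(axis=-1))  # in (0, 1]; smaller r2 -> larger
        logl[bad] = -1e14 * (1.0 + slope)
    return logl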

For comparison, this is my torchquad code:

import numpy as np
import einops
import torchquad
from torchquad import set_up_backend

n_particle = 2
dim = n_particle * 3
sigma = 0.439

def pairwise_distances_squared(points):
    # upper-triangular pair indices, excluding the diagonal
    i, j = np.triu_indices(n_particle, 1)
    # squared distance for every particle pair, shape (batch, n_pairs)
    return np.sum(np.square(points[:, i] - points[:, j]), axis=-1)

def likelihood(x):
    """
    V12-6 potential for LJ.
    """

    x = einops.rearrange(x, 'batch (n_particle n_space_dim) -> batch n_particle n_space_dim', n_particle=n_particle, n_space_dim=3)
    r2 = pairwise_distances_squared(x / sigma)
    r6 = r2 * r2 * r2
    r12 = r6 * r6
    E = (r12 ** (-1) - r6 ** (-1))

    return np.exp(-E.sum(-1))

set_up_backend("numpy", data_type="float64")

integration_value = torchquad.VEGAS().integrate(fn=likelihood, 
                            dim=dim, 
                            N=6000000, 
                            integration_domain=[[0,1]]*6, 
                            seed=4, 
                            use_grid_improve=True, 
                            eps_rel=0, 
                            eps_abs=0, 
                            max_iterations=40, 
                            use_warmup=True)
print(np.log(integration_value))

def likelihood_safe(x):
    """
    V12-6 potential for LJ.
    """

    x = einops.rearrange(x, 'batch (n_particle n_space_dim) -> batch n_particle n_space_dim', n_particle=n_particle, n_space_dim=3)
    r2 = pairwise_distances_squared(x / sigma)
    r6 = r2 * r2 * r2
    r12 = r6 * r6
    E = ((r12 + 1e-16) ** (-1) - (r6 + 1e-16) ** (-1))

    return np.exp(-E.sum(-1))

integration_value = torchquad.Boole().integrate(fn=likelihood_safe, 
                            dim=dim, 
                            N=50000000, 
                            integration_domain=[[0,1]]*6)
print(np.log(integration_value))

resulting in

-0.11434827520105596
-0.11422851429944744
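
As an independent sanity check, one could also brute-force the integral with plain Monte Carlo over the unit cube. This sketch reuses the likelihood function defined above; the sample count is arbitrary:

rng = np.random.default_rng(0)
xs = rng.random((2_000_000, dim))
w = likelihood(xs)                     # integrand exp(-E); Z is its mean over the cube
logZ_mc = np.log(w.mean())
logZ_err = w.std() / (w.mean() * np.sqrt(len(w)))   # delta-method error on logZ
print(logZ_mc, '+-', logZ_err)
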
JohannesBuchner commented 1 month ago

UltraNest reports -0.117 +- 0.019 here, so within these uncertainties both outputs agree.

If you want smaller uncertainties from nested sampling, you would need to crank up the number of live points. I am not sure for what purpose one needs logZ with extremely high accuracy, though.
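
As a hedged illustration (the numbers below are arbitrary, not recommendations), that means passing a larger min_num_live_points to sampler.run, and, per the hint in the log above, a smaller frac_remain:

result = sampler.run(
    min_num_live_points=4000,   # more live points -> smaller logZ uncertainty
    frac_remain=0.001,          # integrate further into the tail (see the log hint)
    dlogz=0.1,                  # a looser target than 0.01 keeps the runtime sane
)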

Regarding speed: with a slice sampler, the cost is dominated by the number of steps per iteration, so nsteps is a parameter you could play with, in addition to the direction proposal.
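
Concretely, a sketch of retuning the step sampler; nsteps = 2 * ndim is only a common starting heuristic, and generate_region_oriented_direction is one of the alternative direction proposals shipped in ultranest.stepsampler:

import ultranest.stepsampler

sampler.stepsampler = ultranest.stepsampler.SliceSampler(
    nsteps=2 * ndim,   # fewer steps run faster but risk correlated samples; verify
    generate_direction=ultranest.stepsampler.generate_region_oriented_direction,
)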

As you go to more parameters, you will probably find VEGAS to become slower than UltraNest.