TuringLang / NestedSamplers.jl

Implementations of single and multi-ellipsoid nested sampling
https://turinglang.org/NestedSamplers.jl/stable/
MIT License

Question regarding Gaussian shells example: what is logz? #62

Closed ngiann closed 3 years ago

ngiann commented 3 years ago

Though I have not used nested sampling before, I know that it does calculate the log-evidence of the model.

I have been looking at the very nice Gaussian shells example. I ran the example code, and it runs exactly as detailed in the documentation.

However, I am not entirely sure what logz is: is it the negative log-evidence or the log-evidence?

As an attempt to find out for myself, I decided to integrate the likelihood function provided by Models.GaussianShells() (which, I suppose, implicitly defines a flat uniform prior on the two parameters).

To that end, I wrote the following function:

using NestedSamplers

function checkinteral(dx=1e-3)
    # get the log-likelihood function
    model, = Models.GaussianShells()

    # convenience function: the likelihood (not log-likelihood)
    aux(x) = exp(model.loglike(x))

    # naive Riemann sum approximating the integral of the likelihood
    xgrid = -6.0:dx:6.0
    total = 0.0  # avoid shadowing Base.cumsum
    for x1 in xgrid, x2 in xgrid
        total += aux([x1, x2])
    end

    total * dx * dx
end

When calling checkinteral() I obtain the result 25.1327..., which I cannot make agree with the logz=-1.75 given in the example, i.e. `log(25.1327)=3.2241`.

Furthermore (because one verification is hardly ever sufficient!), I performed an additional check, this time integrating with the HCubature package. The additional check is implemented as follows:

using HCubature

function onemorecheck()
    model, = Models.GaussianShells()
    # adaptive cubature over the same [-6, 6]^2 domain
    hcubature(x -> exp(model.loglike(x)), -6 * ones(2), 6 * ones(2))
end

Calling onemorecheck() again gives 25.1327... (hcubature returns the integral estimate together with an error bound).

Could somebody please help me out? Thanks.

ngiann commented 3 years ago

OK, I just realised that the log-likelihood function does not include the prior.

Replacing the function aux in the script above with `aux(x) = exp(model.loglike(x)) * pdf(Uniform(-6, 6), x[1]) * pdf(Uniform(-6, 6), x[2])` gives the desired result.

Below is the new script:

using NestedSamplers, Distributions

function checkinteral(dx=1e-3)
    # get the log-likelihood function
    model, = Models.GaussianShells()

    # convenience function: the likelihood times the uniform prior density
    aux(x) = exp(model.loglike(x)) * pdf(Uniform(-6, 6), x[1]) * pdf(Uniform(-6, 6), x[2])

    # naive Riemann sum approximating the evidence integral
    xgrid = -6.0:dx:6.0
    total = 0.0  # avoid shadowing Base.cumsum
    for x1 in xgrid, x2 in xgrid
        total += aux([x1, x2])
    end

    total * dx * dx
end
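For the record, the two numbers from my checks are consistent with each other: the implicit prior is uniform on [-6, 6]^2, so its density is 1/12 per dimension, and

```latex
\log Z \;=\; \log \int L(x)\,\pi(x)\,dx
       \;=\; \log\frac{25.1327}{12^{2}}
       \;\approx\; 3.224 - 4.970
       \;\approx\; -1.75,
```

which matches the logz reported in the documentation example.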

I thought about deleting my issue, but I leave it here instead in case somebody finds it useful. Thanks for the work put into this package.

mileslucas commented 3 years ago

Sorry for the late reply.

Indeed, the Bayesian evidence is analytically the integral of the likelihood times the prior over all of parameter space (https://en.wikipedia.org/wiki/Marginal_likelihood). Here "likelihood" and "prior" are meant in the literal Bayesian sense. This is in contrast to what I would call a "loss" function used for numerical optimization, which applies Bayes' rule manually inside the function being optimized.
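In symbols, with likelihood L(θ) and prior π(θ), the quantity logz estimates

```latex
\log Z \;=\; \log \int_{\Theta} L(\theta)\, \pi(\theta)\, d\theta .
```

For the Gaussian shells example the prior is uniform on [-6, 6]^2, i.e. π(θ) = 1/144 inside the box and zero elsewhere, which is exactly the factor the manual integration above was initially missing.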