yebai opened this issue 4 years ago:
""" Nested sampling algorithm.
#Note-to-self: set a seed here
julia> using NestedSamplers
julia> using Distributions
# eggbox likelihood function
tmax = 3π
julia> function logl(x)
t = @. 2*tmax*x - tmax
return 2 + cos(t[1]/2)*cos(t[2]/2)^5
end
logl (generic function with 1 method)
# prior constrined to line in the range (0, 20)
julia> prior = [
Uniform(0, 20),
Uniform(0, 20)
]
2-element Array{Uniform{Float64},1}:
Uniform{Float64}(a=0.0, b=20.0)
Uniform{Float64}(a=0.0, b=20.0)
# creating the model
julia> model = NestedModel(logl, prior)
NestedModel{typeof(logl),Uniform{Float64}}(logl, Uniform{Float64}[Uniform{Float64}(a=0.0, b=20.0), Uniform{Float64}(a=0.0, b=20.0)])
julia> using StatsBase: sample, Weights
julia> using MCMCChains: Chains
# creating the sampler
# 2 parameters, 100 active points, multi-ellipsoid
julia> spl = Nested(2, 100; bounds = Bounds.MultiEllipsoid)
Nested(ndims=2, nactive=100, enlarge=1.25, update_interval=150)
bounds=MultiEllipsoid{Float64}(ndims=2)
proposal=NestedSamplers.Proposals.Uniform
logz=-1.0e300
log_vol=-4.610166019324897
H=0.0
julia> chain = sample(model, spl;
                      dlogz = 0.2,
                      param_names = ["x", "y"],
                      chain_type = Chains)
Object of type Chains, with data of type 358×3×1 Array{Float64,3}

Log evidence      = 2.086139406693275
Iterations        = 1:358
Thinning interval = 1
Chains            = 1
Samples per chain = 358
internals         = weights
parameters        = x, y

2-element Array{MCMCChains.ChainDataFrame,1}

Summary Statistics
  parameters     mean     std  naive_se    mcse       ess   r_hat
  ──────────  ───────  ──────  ────────  ──────  ────────  ──────
           x   9.9468  5.7734    0.3051  0.5474  315.7439  1.0017
           y  10.0991  5.5946    0.2957  0.2016  396.0076  0.9995

Quantiles
  parameters    2.5%   25.0%   50.0%    75.0%    97.5%
  ──────────  ──────  ──────  ──────  ───────  ───────
           x  0.4682  5.2563  9.5599  14.9525  19.3789
           y  0.5608  5.4179  9.8601  14.8477  19.2020
```
I'm getting the following error while using the `bounds = Bounds.MultiEllipsoid` argument in the function `Nested`: `ERROR: UndefVarError: Bounds not defined`
Are you on version 0.3?
Not yet. I used NestedSamplers version 0.1 with Julia version 1.2. Updating it all now.
Any news on this? Integration between TuringLang and NestedSamplers would be a dream.
The integration with nested sampling should become trivial once https://github.com/TuringLang/DynamicPPL.jl/pull/309 is merged.
cc @ParadaCarleton
Great! Does the PR include an example of this integration by any chance?
Thank you for your hard work.
The few times I've tried figuring out how to do this integration, I've given up because I didn't understand how Turing worked internally. It was never clear to me how to compute the prior transform or log-likelihood methods using the DynamicPPL API. For example, in the current NestedSamplers framework, I need to know the number of parameters and have `loglike(X::AbstractVector)` and `prior_transform(X::AbstractVector)` methods. It was never clear to me how I would construct, e.g., the `prior_transform` from a bunch of `assume` calls, or the likelihood function from `observe` calls. As far as I know, nested sampling is not built to operate on single parameters at a time; rather, it samples from the entire high-dimensional parameter space, since we have to generate new samples contingent on the current iso-likelihood contour.
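For concreteness, this is roughly the shape of the interface being described (the placeholder likelihood is hypothetical; the `quantile`-based transform is one standard way to build a `prior_transform` for independent univariate priors, and NestedSamplers also accepts a vector of distributions directly, as in the example at the top of this issue):

```julia
using Distributions

# log-likelihood evaluated on the whole parameter vector at once
loglike(X::AbstractVector) = -0.5 * sum(abs2, X)  # hypothetical placeholder

# map a point in the unit cube to the parameter space via the
# inverse CDF (quantile function) of each univariate prior
priors = [Uniform(0, 20), Uniform(0, 20)]
prior_transform(X::AbstractVector) = quantile.(priors, X)

prior_transform([0.5, 0.25])  # => [10.0, 5.0]
```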
@yebai how will https://github.com/TuringLang/DynamicPPL.jl/pull/309 make this trivial? It is not clear to me from the PR discussion.
That PR will essentially just make it easier to do something like `loglikelihood(model, namedtuple)` or `loglikelihood(model, vector)`, etc., with very high performance, i.e. make it super easy to evaluate the density of (a large subset of, but not all, definable) models.

This would mean that you could make NestedSamplers.jl work with `Model` without implementing stuff like `assume`, `observe`, `dot_assume`, etc. for the particular samplers.
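For instance (the model below is hypothetical; `loglikelihood` with a `NamedTuple` is the API referred to above):

```julia
using DynamicPPL, Distributions

# hypothetical model with a single scalar parameter
@model function gdemo(y)
    m ~ Normal(0, 1)
    y ~ Normal(m, 1)
end

model = gdemo(1.5)

# evaluate the likelihood directly at a NamedTuple of parameter values,
# without implementing `assume`/`observe` for a particular sampler
loglikelihood(model, (m = 0.3,))
```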
What kinds of models might not work? And is there a timeline on the PR?
Besides evaluating the likelihood, how would one efficiently implement the `prior_transform`?
> What kinds of models might not work?

Well, all models would technically work, but depending on what the underlying storage is, you might need to do some manual labour:

- `NamedTuple`: if you have access patterns that look like `x[i] ~ ...`, i.e. you're sampling parts of some container, then you need to pre-allocate the container when constructing the "trace"/`SimpleVarInfo`, e.g. `SimpleVarInfo((x = zeros(n),))`. If you're not sampling parts of some container, i.e. all your statements are of the form `x ~ ...`, then it will just work and be blazingly fast.
- `Dict`: we can support statements such as `x[i] ~ ...` without any input from the user, but it'll be much slower than using `NamedTuple`.
You can check out the docs for the PR here: https://turinglang.github.io/DynamicPPL.jl/previews/PR309/#DynamicPPL.SimpleVarInfo, and the docstring for `logjoint`, etc., here: https://turinglang.github.io/DynamicPPL.jl/previews/PR309/#DynamicPPL.logjoint-Tuple{Model,%20Union{AbstractDict,%20NamedTuple}}
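For illustration, a minimal sketch of the storage point above, assuming `SimpleVarInfo` and `logjoint` behave as in the linked PR docs (the model itself is hypothetical):

```julia
using DynamicPPL, Distributions

# hypothetical model that samples parts of a container: x[i] ~ ...
@model function vecdemo(n)
    x = Vector{Float64}(undef, n)
    for i in 1:n
        x[i] ~ Normal()
    end
end

model = vecdemo(3)

# NamedTuple storage: pre-allocate the container, as described above
vi = SimpleVarInfo((x = zeros(3),))

# evaluate the joint density at a NamedTuple of parameter values
logjoint(model, (x = [0.1, -0.2, 0.3],))
```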
> And is there a timeline on the PR?

This is open-source, bby, there are no timelines! :guitar:

Nah, I kid :) That PR is my highest priority in my open-source work, so it shouldn't be long now. I'm just ironing out some quirks and making sure that all test cases are covered. This is a pretty significant change, so we need to make sure that it's not a bad idea.
> Besides evaluating the likelihood, how would one efficiently implement the `prior_transform`?
So this is actually something I'm somewhat confused about: why do we need to work on the unit-cube? I'm new to nested sampling, so please forgive my ignorance :upside_down_face:
But in general, we deal with transformations between spaces in Turing.jl too; in fact we have an entire package to deal with this (Bijectors.jl). So IIUC, this shouldn't be an issue since we can just provide the transform using a composition of the bijectors. But we could also provide a function which extracts the distributions used, if need be.
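For instance, a sketch of what such a composition might look like for a single parameter (assumptions: a recent Bijectors version exporting `inverse`; the composed map is a bijection from the unit interval to the support, not the quantile transform in general):

```julia
using Bijectors, Distributions

# `bijector(d)` maps the support of d to ℝ. Composing its inverse with the
# bijector of Uniform(0, 1) yields a map from the unit interval to the
# support of d -- a valid bijection, though not the quantile transform in
# general. (Older Bijectors versions use `inv` instead of `inverse`.)
d = Gamma(2.0, 3.0)
to_real = bijector(Uniform(0, 1))   # (0, 1) → ℝ
from_real = inverse(bijector(d))    # ℝ → support(d)
unit_to_support = from_real ∘ to_real

unit_to_support(0.5)                # a point in the support of d
```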
> So this is actually something I'm somewhat confused about: why do we need to work on the unit-cube? I'm new to nested sampling, so please forgive my ignorance 🙃
This is not strictly necessary from a technical point of view. To perform nested sampling, one only needs to sample from a constrained prior, or sample from the unconstrained prior and then perform a rejection step. However, it is very popular in astrophysics to sample from a unit cube to simplify implementation. @mileslucas, would it be possible to work directly with the prior?

It seems the current `prior_transform` is coupled with the proposal implementations. That makes working directly with priors slightly more involved: we would need to implement new proposals that operate in the parameter space instead of the unit-cube space. The current implementation also limits the models to which nested sampling applies to those whose prior can be factorised into univariate distributions with known quantile functions.

I think a near-term integration goal should be supporting the same family of models that nested sampling can handle, i.e. models with priors that can be written as a product of independent univariate distributions.
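For concreteness, a sketch (not NestedSamplers' actual implementation) of the rejection step described above:

```julia
using Random, Distributions

# Draw from the prior until the point lies above the current
# iso-likelihood contour logl_star.
function sample_constrained(rng::AbstractRNG, priors, logl, logl_star; maxtries = 10_000)
    for _ in 1:maxtries
        x = [rand(rng, d) for d in priors]
        logl(x) > logl_star && return x
    end
    error("no prior sample above the contour after $maxtries tries")
end

# usage with the priors from the example at the top of this issue
priors = [Uniform(0, 20), Uniform(0, 20)]
sample_constrained(Random.default_rng(), priors, x -> -sum(abs2, x), -250.0)
```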
> This is not strictly necessary from a technical point of view. To perform nested sampling, one only needs to sample from a constrained prior, or sample from the unconstrained prior and then perform a rejection step. However, it is very popular in astrophysics to sample from a unit cube to simplify implementation. @mileslucas, would it be possible to work directly with the prior?
Initially it seemed like it's just a matter of convenience, e.g. from https://dynesty.readthedocs.io/en/latest/overview.html#challenges:
> In both cases, it is much easier to deal with uniform (rather than arbitrary) priors. As a result, most nested sampling algorithms/packages (including dynesty) are designed to sample within the 𝐷-dimensional unit cube.
But it seems to go a bit deeper than this: e.g. MultiNest partitions the active points into clusters before constructing the ellipsoidal bounds, and this clustering apparently requires the points to be uniformly distributed in the parameter space (see §5.2) [^MUL09]. AFAIK, in general, the only way you can make the prior look uniformly distributed in the parameter space is through quantile transformations. With that being said, it doesn't seem like the uniformity assumption is always required; rather, it is a particular requirement for the clustering approach taken by MultiNest.
[^MUL09]: Feroz, F., Hobson, M. P., & Bridges, M. (2009). MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics. Monthly Notices of the Royal Astronomical Society, 398(4), 1601–1614. http://dx.doi.org/10.1111/j.1365-2966.2009.14548.x
> The integration with nested sampling should become trivial once TuringLang/DynamicPPL.jl#309 is merged.
> cc @ParadaCarleton
Any news on this?
I’m working on it, although I’ve been pretty busy these past few weeks. If I had to make a guess, maybe next month? But it depends on a lot.
That's very nice to hear. Thank you for your work. If I can help with testing stuff, let me know.
Bump
NestedSamplers is an independent sampler library at the moment and lacks an interface for models specified by the domain-specific language (DSL) in Turing. However, since NestedSamplers supports the `AbstractMCMC` interface, it should be a relatively lightweight task to add such an interface. A good starting point is the elliptical slice sampler (ESS) interface, which can be found at
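To make that concrete, here is a hypothetical sketch of such a bridge for a model with a factorised prior, combining the `loglikelihood(model, namedtuple)` API discussed above with NestedSamplers' `(loglike, prior_transform)` constructor (the model and all names are illustrative only, not an existing interface):

```julia
using Distributions, DynamicPPL, NestedSamplers
using StatsBase: sample
using MCMCChains: Chains

# hypothetical DynamicPPL model with a single Uniform(0, 20) prior
@model function demo(y)
    m ~ Uniform(0, 20)
    y ~ Normal(m, 1)
end

turing_model = demo(9.0)

# adapt to NestedSamplers' interface: likelihood over a parameter vector,
# plus a quantile-based map from the unit cube to the prior's support
loglike(X) = loglikelihood(turing_model, (m = X[1],))
prior_transform(X) = quantile.([Uniform(0, 20)], X)

model = NestedModel(loglike, prior_transform)
spl = Nested(1, 100)
chain = sample(model, spl; dlogz = 0.2, chain_type = Chains)
```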