Is there a particular reason why the stepping out is done in linear steps rather than by doubling the step width (as in other implementations)? I am running into the limit (1e4) when testing on a 100-d Gaussian with std ranging from 1e-1 ... 1e-9, and increasing the limit did not solve this.
Hi Johannes,
Could you post a piece of code so that I can reproduce the problem?
The reason we chose to implement the stepping-out procedure instead of the doubling procedure is that the latter entails an additional "testing" step, thus increasing the number of likelihood evaluations.
Furthermore, once the length scale is tuned (usually after a few steps), zeus performs on average just one step-out (expansion).
If you're interested in more details, I recommend the 2003 Slice Sampling paper by Radford Neal.
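For reference, here is a minimal, simplified sketch of the stepping-out expansion described there. It is illustrative only, not zeus's actual code, and the names (logf, log_y, w, max_steps) are made up for the example:

import numpy as np

def stepping_out(logf, x0, log_y, w, max_steps=10000, rng=np.random):
    # Place an interval of width w at a random position around the current point x0
    # (x0 lies inside the slice, i.e. logf(x0) >= log_y).
    left = x0 - w * rng.uniform()
    right = left + w
    # Expand each end in linear steps of size w until it falls outside the slice
    # {x : logf(x) > log_y}, or until the expansion budget is exhausted.
    steps = 0
    while logf(left) > log_y and steps < max_steps:
        left -= w
        steps += 1
    while logf(right) > log_y and steps < max_steps:
        right += w
        steps += 1
    return left, right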
I'm happy to answer any questions if I can.
Cheers, Minas
import numpy as np

ndim = 100
sigmamin = 10**(-10 + ndim**0.5/2)
sigma = np.logspace(-1, np.log10(sigmamin), ndim)
width = 1 - 5 * sigma
width[width < 1e-20] = 1e-20
centers = (np.sin(np.arange(ndim)/2.) * width + 1.) / 2.

def loglike(theta):
    # vectorized: theta has shape (nwalkers, ndim)
    like = -0.5 * (((theta - centers)/sigma)**2).sum(axis=1) \
        - 0.5 * np.log(2 * np.pi * sigma**2).sum()
    like[~np.isfinite(like)] = -1e300
    return like
That vectorized likelihood function is explored on the unit hypercube (0 < theta < 1).
The problem here is that the mode is very well localised in a small region of a high-dimensional parameter space.
MCMC algorithms struggle with this situation because of the way we initialise the walkers. In particular, if you randomly initialise the walkers inside the unit hypercube, then all of them will be in positions of vanishingly small probability.
A conventional (rejection-based) MCMC algorithm will get stuck (the acceptance rate will drop to zero). zeus, on the other hand, is a non-rejection method, so it will try very hard to adjust its scaling parameter, performing a huge number of expansions/contractions.
There's an easy way to get around this problem: initialise the walkers at the peak/mode. You can do this by first finding the maximum-likelihood estimate.
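For example, something along these lines (a sketch only, using ndim, centers and sigma from your snippet; nwalkers, the optimizer settings and the 1e-4 scale are just illustrative choices):

import numpy as np
from scipy.optimize import minimize

nwalkers = 2 * ndim  # illustrative choice

def negloglike(theta):
    # scalar (single-point) negative log-likelihood, for the optimizer
    return 0.5 * (((theta - centers) / sigma) ** 2).sum() \
        + 0.5 * np.log(2 * np.pi * sigma ** 2).sum()

# Find the maximum-likelihood estimate (in this toy problem it is simply
# `centers`, but in general you would locate it numerically).
res = minimize(negloglike, x0=np.full(ndim, 0.5), method="L-BFGS-B",
               bounds=[(0.0, 1.0)] * ndim)
theta_mle = res.x

# Start all walkers in a tight ball around the MLE, scaled by sigma.
start = theta_mle + 1e-4 * sigma * np.random.randn(nwalkers, ndim)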
I hope this is helpful.
Minas
Did you try initialising the walkers as numpy.random.normal(centers, sigma, size=(nwalkers, ndim))?
I tried start = 0.0001 * np.random.randn(walkers, ndim) + centers, which is a bit more agnostic than what you suggested. It worked like a charm.
Could you show me the way you defined the log prior?
Yeah, my sentence about the hypercube was supposed to define the prior, not the walker initialisation. I guess that was confusing.
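For what it's worth, one way to fold that flat unit-hypercube prior into the log-probability would be something like this (a sketch only; the single-walker function is my own, and the sampler calls assume zeus's standard EnsembleSampler / run_mcmc interface):

import numpy as np
import zeus

def logprob(theta):
    # flat prior on the unit hypercube: -inf outside (0, 1)^ndim
    if np.any(theta <= 0.0) or np.any(theta >= 1.0):
        return -np.inf
    # same Gaussian log-likelihood as above, for a single walker
    return -0.5 * (((theta - centers) / sigma) ** 2).sum() \
        - 0.5 * np.log(2 * np.pi * sigma ** 2).sum()

sampler = zeus.EnsembleSampler(nwalkers, ndim, logprob)
sampler.run_mcmc(start, 1000)
chain = sampler.get_chain(flat=True)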
Sure, but did you manage to make it work with start = 0.0001 * np.random.randn(walkers, ndim) + centers or with numpy.random.normal(centers, sigma, size=(nwalkers, ndim))?
I'm not sure I can see the problem.
I'm gonna close this issue for now. We can open it again if you have any updates.