Closed by gdesvignes 2 years ago
This is the behaviour when the live points slowly become linearly dependent and collapse onto a sub-space.
This means you need more steps in the step sampler.
You may also want to try a different generate_direction method, such as unit directions.
Thanks for the quick feedback. After some more testing, I found that the main issue was the likelihood, which was previously computed in single precision only. I then doubled the number of steps, but the run became very slow and has not converged after a few days, although it seems very close. In addition, I get a lot of "wandered out of L constraint; resetting" messages. I would need to do some proper benchmarking, and maybe compare against PolyChord, but it seems the code spends a lot of time creating the live points.
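As an aside, one way to guard against a single-precision likelihood is to cast the result to float64 before returning it to the sampler. This is a minimal numpy sketch, not the actual CUDA likelihood; the toy Gaussian computation stands in for whatever the GPU kernel returns in float32:

```python
import numpy as np

def loglike(params):
    # params: (npoints, ndim) array of parameter vectors (vectorized call)
    # stand-in for a GPU kernel that returns float32 results
    logl_f32 = np.asarray(-0.5 * np.sum(params**2, axis=1), dtype=np.float32)
    # cast back to double precision so the sampler sees distinct values
    return logl_f32.astype(np.float64)
```

The cast does not recover precision lost inside the kernel, but it at least ensures the sampler-side comparisons run in double precision.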
Again, thanks for your input and this awesome package.
Yes.
Setting region_class=ultranest.mlfriends.RobustEllipsoidRegion
should help a lot. This constructs only ellipsoids instead of full MLFriends regions (which only help with <20d, and are computationally costly in high-d).
"wandered out of L constraint; resetting" means that a walker is, after a new iteration has increased the likelihood constraint, now outside, and has to be restarted. This only happens when you are running distributed with MPI, which maintains multiple walkers.
Recently, I completed the implementation of a much faster, vectorized step sampler. I verified that it works correctly on a 100d Gaussian. Perhaps you can try it as well, and give me feedback.
Here is how to run it. First import a few things:
from ultranest import ReactiveNestedSampler
from ultranest.mlfriends import RobustEllipsoidRegion
from ultranest.popstepsampler import PopulationSliceSampler, generate_cube_oriented_direction
Then make the sampler as usual. You need a vectorized likelihood to see speed-ups:
sampler = ReactiveNestedSampler(paramnames, loglike, transform=transform, vectorized=True)
Here is the definition of the step sampler:
sampler.stepsampler = PopulationSliceSampler(
popsize=popsize, nsteps=nsteps,
generate_direction=generate_cube_oriented_direction, log=verbose
)
Choose popsize, the population of walkers to maintain; perhaps try 40 to start. Choose nsteps, the number of steps each walker makes; perhaps start with 3 * ndim.
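For a 211-parameter model, those starting suggestions translate to something like the following (illustrative starting values, not tuned settings):

```python
ndim = 211         # number of model parameters
popsize = 40       # walkers maintained in parallel
nsteps = 3 * ndim  # slice-sampling steps per walker
```

These are the popsize and nsteps variables passed to PopulationSliceSampler above; both typically need tuning against the cost of the likelihood.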
Then send it off. Here are the arguments I used:
results = sampler.run(
frac_remain=0.01, update_interval_volume_fraction=0.01,
max_num_improvement_loops=0, min_num_live_points=400,
viz_callback=None, region_class=RobustEllipsoidRegion
)
A low update_interval_volume_fraction means the region is rebuilt only rarely. You probably want min_num_live_points to be at least a few times the dimensionality, so your 2000 sounds reasonable.
The live points decreasing is caused by plateaus, i.e., multiple live points having the same likelihood value; they have to be removed together, and are only replaced by one new live point. See here.
Please investigate why multiple points have the same likelihood value. If they have the exact same parameter values, the step sampler got stuck / did not make a significant move. If not, the likelihood is not well-defined.
Please reopen if this is still an issue.
Description
I'm trying to model a dataset of a few thousand points, and I wrote a likelihood function for GPU in CUDA that uses single-precision floats. The model has 211 parameters, so I'm using a step sampler as shown below with over 2000 live points. Using fewer live points makes the code apparently detect lots of clusters, whereas the posterior should be mono-modal. Throughout the run, the number of live points slowly decreases as a large number of plateaux are supposedly found, until after several hours of well-advanced sampling only ~350 live points remain, clusters are detected, and the code eventually crashes.
Is there a way to force the number of live points to remain fixed? Or am I missing something else? Any feedback would be appreciated.
What I Did
Output: