Closed pboorm closed 3 years ago
Could you provide a data set which reproduces the problem, so that the example is self-contained?
The warning is not a problem in itself; it just lets you know that the problem is difficult. Fitting a line to relatively few data points is actually really hard to do correctly, because the distribution tails become important. UltraNest tries hard to do it correctly.
You can also try to use a "step sampler", which can overcome heavy tails and solve high-dimensional problems.
import ultranest.stepsampler
sampler.stepsampler = ultranest.stepsampler.RegionSliceSampler(nsteps=100, adaptive_nsteps='move-distance')
This is mentioned in the high-dim tutorial https://johannesbuchner.github.io/UltraNest/example-sine-highd.html
Thanks a lot for the suggestion! This has enabled the difficult fits to finish without an efficiency warning by setting nsteps to a fixed value (instead of the adaptive_nsteps method). Linked to this, I have a couple of follow-up queries:
If you use adaptive_nsteps='move-distance', the number of steps is varied to try to adjust to the problem difficulty; nsteps is only the starting value. It may be that it tunes to very small nsteps values, speeding up drastically. The initial nsteps should be reasonably high so there are no mistakes in the first few iterations, so 100-400 is what I usually use.
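For readers following along, here is a minimal sketch of how the step sampler suggested above might be attached to a straight-line fit. The prior ranges, the parameter names, and the helper names (prior_transform, make_log_likelihood, run_fit) are assumptions made for illustration and do not come from the thread; only the RegionSliceSampler line is copied from the reply above, and newer UltraNest versions may spell it differently. UltraNest is imported inside run_fit so that the likelihood and prior can be checked without it installed.

```python
import math


def prior_transform(cube):
    # Map unit-cube coordinates to (slope, offset, log10 scatter).
    # The ranges below are illustrative assumptions, not from the thread.
    slope = cube[0] * 20.0 - 10.0
    offset = cube[1] * 20.0 - 10.0
    log_scatter = cube[2] * 4.0 - 3.0
    return [slope, offset, log_scatter]


def make_log_likelihood(x, y):
    # Build a Gaussian log-likelihood for y = slope * x + offset with a
    # fitted scatter. x and y are plain sequences of floats.
    def log_likelihood(params):
        slope, offset, log_scatter = params
        sigma = 10.0 ** log_scatter
        total = 0.0
        for xi, yi in zip(x, y):
            resid = yi - (slope * xi + offset)
            total += (-0.5 * (resid / sigma) ** 2
                      - math.log(sigma)
                      - 0.5 * math.log(2.0 * math.pi))
        return total
    return log_likelihood


def run_fit(x, y, nsteps=100, **run_kwargs):
    # Hypothetical helper: requires UltraNest to be installed.
    import ultranest
    import ultranest.stepsampler

    sampler = ultranest.ReactiveNestedSampler(
        ["slope", "offset", "log_scatter"],
        make_log_likelihood(x, y),
        prior_transform,
    )
    # Step sampler as suggested in the reply above; nsteps is only the
    # starting value when adaptive_nsteps is set.
    sampler.stepsampler = ultranest.stepsampler.RegionSliceSampler(
        nsteps=nsteps, adaptive_nsteps='move-distance')
    # Extra keyword arguments are passed through to sampler.run().
    return sampler.run(**run_kwargs)
```

Calling run_fit(x, y, nsteps=400) would start from the upper end of the 100-400 range mentioned above.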
Thanks! I've been using the step sampler a lot more, which has helped in the majority of cases.
Attached is a minimal working example with a dataset and Jupyter notebook to read and fit the data with UltraNest (without and with the step sampler). This particular dataset does not have a very strong correlation to start with, and the correlation is likely even weaker after bootstrapping with repeats. Would it be detrimental to the final result to set a finite max_ncalls prior to running the sampler?
That should be fine, but you may get only one effective sample (and thus collapsed uncertainties) if it does not reach convergence.
I would try frac_remain=0.5
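To illustrate what "only one effective sample" means, the sketch below computes the Kish effective sample size from a set of posterior weights. This is a common definition used here as an assumption for illustration; the thread does not say which formula UltraNest reports, and the function name is hypothetical.

```python
def effective_sample_size(weights):
    # Kish effective sample size: (sum of w)^2 / (sum of w^2).
    # Equals len(weights) for equal weights and approaches 1 when a
    # single weight dominates.
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return total ** 2 / sum(w * w for w in weights)
```

When a run stops early (for example because a finite max_ncalls was hit), most of the weight can sit on one point, so the value is close to 1 and any spread estimated from the weighted samples shrinks towards zero.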
I am closing this issue, but please reopen if you still have problems with the latest version. I improved the warning to be more helpful.
Description
I am trying to use bootstrapping with UltraNest to test the effect of outliers on straight line fits to some data. To do this, the code resamples the original dataset N times, allowing repeats, and fits each resampled dataset with UltraNest. However, if a large enough portion of the resampled dataset consists of repeats, the fit doesn't finish, and I get the following warning about efficiency:
UserWarning: Sampling from region seems inefficient. You can try increasing nlive, frac_remain, dlogz, dKL, decrease min_ess). [0/40 accepted, it=2500]
  warnings.warn("Sampling from region seems inefficient. You can try increasing nlive, frac_remain, dlogz, dKL, decrease min_ess). [%d/%d accepted, it=%d]" % (accepted.sum(), ndraw, nit))
I've tried altering the arguments mentioned in the warning, and also Lepsilon, but the values I've tried haven't fixed the issue. I've also tried narrowing the initial priors on the slope and offset, but the allowed range had to be so small that it no longer included the maximum posterior parameters derived from other fits to the data without duplicates.
Fixing the maximum number of duplicates allowed in the resampled dataset worked, but this might bias the interpretation from bootstrapping the data.
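The resampling described above, and the duplicate cap used as a workaround, can be sketched roughly as follows. This is an illustration only: the function names, the rejection approach to capping, and the max_tries default are assumptions, not the code used in this report.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    # Draw n row indices with replacement (a plain bootstrap resample).
    return [rng.randrange(n) for _ in range(n)]


def duplicate_summary(indices):
    # Report how many distinct rows were drawn and the largest number
    # of times any single row is repeated.
    counts = Counter(indices)
    return {
        "unique_fraction": len(counts) / len(indices),
        "max_repeats": max(counts.values()),
    }


def capped_bootstrap_indices(n, rng, max_repeats, max_tries=1000):
    # Redraw until no row appears more than max_repeats times.
    # This is no longer a plain bootstrap, which is the possible bias
    # mentioned above.
    for _ in range(max_tries):
        indices = bootstrap_indices(n, rng)
        if duplicate_summary(indices)["max_repeats"] <= max_repeats:
            return indices
    raise RuntimeError("no resample satisfied the duplicate cap")
```

With a pandas dataframe, a resample could then be taken with df.iloc[indices], and duplicate_summary can be logged per resample to see which ones trigger the efficiency warning.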
What I Did
(Here, df is a pandas dataframe).