Open segasai opened 1 year ago
Pinging @joshspeagle again.
Interesting. I don't know how I feel about benchmarking dlogz to be based on neff, since fundamentally it was designed to provide an upper bound on the remainder of the integral. The default choice in many cases was picked assuming some baseline tolerance of a few %, but this does ignore the fact that you almost always are recycling the final set of live points so you expect to be probing interior to the final threshold anyways.
Given that the default stopping criterion in some cases has been adjusted to depend on neff, this is probably a reasonable (and somewhat justified) choice. I would just think it's probably good to make sure that the initial run is able to probe far enough to not require another batch to try and sample beyond the final live point for people using the dynamic sampler (for the static sampler this should be fine).
Thanks,
I agree we certainly want the first run to probe "deep" enough in the posterior. My concern with the current behaviour is that it is
Okay, given your comments, let me code a function that stays conservative, i.e. one that does not increase dlogz above, say, 0.1, and then maybe we can take a look at it again.
Currently dlogz and dlogz_init values are taken out of thin air pretty much and they are often set to 0.01. But there is a way of motivating their choice.
The rationale there is the following. I'll assume we're sampling an N-dim Gaussian with n live points, and aim to have Neff samples. I'll also define Z(r) as the posterior mass within a ball of radius r.
Given that we want $N_{eff}$ samples, we want the innermost point in our samples to satisfy approximately $Z(r_{in}) = 1/N_{eff}$. At the same time, given that the live points are uniformly distributed, the radius of the outermost point is $r_{out} = n^{\frac{1}{N}} r_{in}$. The remaining $\delta \log Z$ (in the dynesty sense) for the outermost point is then $-\log (1-Z(r_{out}))$. Given that $Z(r) = IncGamma(\frac{N}{2}, \frac{r^2}{2})$ (the regularized lower incomplete gamma function), one can compute $\delta \log Z$ given $N_{eff}$, n live points and N dimensions. Here is the code doing this calculation:
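A minimal sketch of that calculation, using only the standard library. The helper names `reg_lower_gamma`, `inv_reg_lower_gamma` and `suggest_dlogz` are illustrative and not part of dynesty; `scipy.special.gammainc` and `scipy.special.gammaincinv` would do the same job as the two helpers. The result is returned as the positive quantity $-\log(1 - Z(r_{out}))$ so that it can be compared with dlogz directly.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x), via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0
    total = 1.0
    k = 1
    # P(a, x) = x^a e^{-x} / Gamma(a + 1) * sum_k x^k / ((a+1)...(a+k))
    while term > 1e-16 * total:
        term *= x / (a + k)
        total += term
        k += 1
    prefactor = math.exp(a * math.log(x) - x - math.lgamma(a + 1))
    return min(1.0, prefactor * total)


def inv_reg_lower_gamma(a, p):
    """Solve P(a, x) = p for x by bisection (P is monotonic in x)."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(a, hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def suggest_dlogz(ndim, nlive, neff):
    """Hypothetical helper: dlogz motivated by neff for an N-dim Gaussian."""
    a = ndim / 2.0
    # Innermost point: Z(r_in) = 1 / neff, expressed through x = r^2 / 2.
    x_in = inv_reg_lower_gamma(a, 1.0 / neff)
    # Outermost live point: r_out = nlive^(1/ndim) * r_in, so x scales as r^2.
    x_out = x_in * nlive ** (2.0 / ndim)
    # Remaining evidence fraction inside r_out, as a positive dlogz.
    return -math.log1p(-reg_lower_gamma(a, x_out))


for ndim, nlive, neff in [(4, 100, 100), (100, 100, 100), (10, 100, 10000)]:
    print(ndim, nlive, neff, suggest_dlogz(ndim, nlive, neff))
```

The series used here is only meant for the moderate arguments that show up in this problem; for production use the scipy routines are the safer choice.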
For example:
- ndim=4, nlive=100, neff=100 gives dlogz=0.6
- ndim=100, nlive=100, neff=100 gives dlogz=0.04
- ndim=10, nlive=100, neff=10000 gives dlogz=0.005
This is a motivation in terms of Neff. I haven't thought about a motivation in terms of logz accuracy. But presumably if we require neff to be at least 100 in the calculation above, that will guarantee that our innermost point corresponds to Z_in/Z_tot <= 0.01, which should be good enough for good logz accuracy.
Thoughts @joshspeagle ?