joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License

Logic of update_interval #290

Closed segasai closed 3 years ago

segasai commented 3 years ago

While sampling a 128-d Gaussian I noticed that the boundaries are updated quite often, which led me to look at update_interval:

https://github.com/joshspeagle/dynesty/blob/6843ca1e0df26410679cf39cb2ce43695782698f/py/dynesty/dynesty.py#L426

I was wondering what the logic is there, especially given the claim: "Default behavior is to target a roughly constant change in prior volume, with". If that's the case, then update_interval should be proportional to the number of live points, since the prior volume change per step is ~ -1/K.
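A quick numerical sketch of that shrinkage argument (the number of live points here is illustrative):

```python
import math

# In nested sampling with K live points, each iteration shrinks the
# remaining prior volume X by roughly a factor of exp(-1/K), i.e.
# d(ln X) ≈ -1/K per iteration.
K = 500           # number of live points (illustrative)
log_volume = 0.0  # ln X starts at ln(1) = 0

for _ in range(K):
    log_volume -= 1.0 / K

# After K iterations the log prior volume has shrunk by ~1, so an
# update interval targeting a fixed volume change per update should
# scale linearly with K.
print(round(-log_volume, 6))  # ~1.0
```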

joshspeagle commented 3 years ago

It is proportional to the number of live points if the input is a float (which most of the defaults are). See a few lines down:

https://github.com/joshspeagle/dynesty/blob/6843ca1e0df26410679cf39cb2ce43695782698f/py/dynesty/dynesty.py#L442
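For illustration, the conversion a few lines down amounts to something like this (a simplified sketch with a made-up helper name, not dynesty's actual code):

```python
# A float update_interval is interpreted as a multiple of the number
# of live points; an int is used as-is. This mirrors the line
#   update_interval = max(1, round(update_interval * nlive))
def resolve_update_interval(update_interval, nlive):
    """Hypothetical helper sketching the float-to-int scaling."""
    if isinstance(update_interval, float):
        # e.g. 1.5 with nlive=500 -> update bounds every 750 iterations
        return max(1, round(update_interval * nlive))
    return update_interval

print(resolve_update_interval(1.5, 500))  # 750
print(resolve_update_interval(100, 500))  # 100
```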

segasai commented 3 years ago

Yes, I saw that second line, but it is not the default! (Most users will probably never try to change it, since it's a fairly obscure parameter for them.)
In my opinion the defaults should be the most sensible ones, and it was not clear what the rationale for the current defaults is. IMO they either need to be changed or justified. If we think a floating update interval is better, we should make it the default.

joshspeagle commented 3 years ago

I'm confused -- a float is the default option in all cases if no value is passed. Can you please highlight which of the default options is not a float?

segasai commented 3 years ago

Ah, I missed that this logic https://github.com/joshspeagle/dynesty/blob/d16e716e1791efab859349d0363aff9fd8af88fa/py/dynesty/dynesty.py#L422 is applied first, which converts them to floating point, and then the second branch is executed. I'll refactor it a bit then...

segasai commented 3 years ago

Actually the dynamic sampler does not have the

        update_interval = max(1, round(update_interval * nlive))

code

joshspeagle commented 3 years ago

I think those are applied elsewhere, such as https://github.com/joshspeagle/dynesty/blob/d16e716e1791efab859349d0363aff9fd8af88fa/py/dynesty/dynamicsampler.py#L1126

I don't remember the logic behind why I did that.

segasai commented 3 years ago

Okay. I think the logic probably needs to be unified, since it seems a bit spread out, but it's probably not a big deal. I was just surprised that, when running the 128-dim Gaussian with 5000 live points, I ended up with something like 100k bounds, so I felt that something was wrong. I think we need a function like get_update_interval(ndim, sample, bound)
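A hypothetical sketch of what such a helper could look like. Everything here is made up for illustration: the extra nlive argument, the per-method multipliers, and the no-bound behavior are assumptions, not dynesty's actual defaults.

```python
def get_update_interval(ndim, sample, bound, nlive):
    """Sketch of a unified update-interval helper (all values illustrative)."""
    if bound == 'none':
        # With no bounding distribution there is nothing to rebuild.
        return float('inf')
    # Per-method multipliers, in units of nlive -- invented for this
    # sketch; the real per-sampler defaults live in dynesty.py.
    multipliers = {'unif': 1.5, 'rwalk': 0.15, 'slice': 0.9, 'rslice': 2.0}
    m = multipliers.get(sample, 1.0)
    return max(1, round(m * nlive))

print(get_update_interval(128, 'rwalk', 'multi', 5000))  # 750
```

Centralizing the logic this way would at least make the scaling with nlive explicit in one place, whichever defaults end up being chosen.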

segasai commented 3 years ago

But also, the question is: why does update_interval need to depend on the number of function evaluations rather than the number of iterations?

segasai commented 3 years ago

> I think those are applied elsewhere, such as
>
> I don't remember the logic behind why I did that.

It's possibly because in the dynamic sampler the number of live points can change and is not defined when the sampler is initialized.

joshspeagle commented 3 years ago

> It's possibly because in the dynamic sampler the number of live points can change

Ahhhhh, yes. That would be why.

> But also, the question is: why does update_interval need to depend on the number of function evaluations rather than the number of iterations?

It was designed to scale with the expected sampling efficiency of the various sampling methods. Ideally you trigger an update after some m * K iterations have passed, where K is the number of live points and m is some constant based on how much of the volume you want to traverse between updates. However, you sometimes want to trigger immediately if you've constructed a bad bound and proposed a bunch of positions, which is why I have it scale with the number of likelihood evaluations rather than the number of iterations. That scaling should then go as m / s * K, where s is the sampling efficiency. So m / s roughly determines the constant I chose, which was largely set by tests I ran on a few test distributions.
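A rough sketch of that scaling (the values of m and s below are illustrative, not dynesty's actual constants):

```python
# To traverse a fraction m of the volume between bound updates you want
# ~m * K iterations; at sampling efficiency s, that corresponds to a
# budget of ~(m / s) * K likelihood calls between updates.
def update_interval_ncall(m, s, nlive):
    """Likelihood-call budget between bound updates (illustrative)."""
    return max(1, round(m / s * nlive))

K = 500
# At efficiency s = 0.3 with m = 0.5, rebuild the bound every ~833 calls.
# If a bad bound tanks the efficiency, that call budget is exhausted
# after far fewer accepted iterations, so the bound is rebuilt sooner --
# which is the advantage of counting calls rather than iterations.
print(update_interval_ncall(0.5, 0.3, K))  # 833
```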