joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License
347 stars 76 forks source link

Broken Monte-Carlo volume calculations #398

Closed segasai closed 1 year ago

segasai commented 1 year ago

I believe that all the monte_carlo_logvol()/mc_integrate functions inside bounding.py are broken.

The reason is everywhere the volumes are computed this kind of formula Vestim = Vsum/E[q] where the E[q] is average q when sampled over multiple subvolumes (q is the number of subvolumes associated with each point). And Vsum is the sum of all subvolumes. I.e. here : https://github.com/joshspeagle/dynesty/blob/cab55c3d77f9ea301094260185f59cc3829c4ad3/py/dynesty/bounding.py#L1019 (the same eqn is used for all bounds)

I can't really see where this equation is coming from, but the correct expression should be this I believe:

Vestim = E[1/q] * Vsum

The reason is if I1(x) I2(x) ... In(x) are indicators associated with volumes V1 V2 Vn Then the volume integral

$$ Volume = \int (I_1>0) | (I_2>0 ) | (I_3 > 0) ..... dx = \int \frac{(I_1>0) | (I_2>0 ) | (I_3 > 0) .....}{I_1+I_2 + ... + I_n} (I_1+ .... I_n) dx$$

$$ Volume = \sum_i V_i \int \frac{(I1>0) | (I2>0 ) | (I3 > 0) ....}{I_1+I_2 + ... + I_n} \frac{I_1+ .... I_n}{\sum_i V_i} dx $$

$$ Volume \approx (\sum_i V_i) E[1/q]$$

subject to points being sampled uniformly across all the subvolume in proportion to its size

Demonstration in 2d with rectangles where the answer is clearly wrong (the correct volume should be 6)

In [21]: import dynesty.bounding as db

In [22]: np.exp(X.monte_carlo_logvol([[0.,0.],[1,0]],rstate=np.random.default_rng(1),ndraws=100000))
Out[22]: 5.335752207667475

Currently the monte_carlo_logvol is not used for anything, it seems, so maybe it is all not needed, but I am wondering when fixed it maybe could be used as a safety check of the volumes

segasai commented 1 year ago

399 fixes the issue