joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License
347 stars 76 forks source link

Fixes to bounding updates #428

Closed segasai closed 1 year ago

segasai commented 1 year ago

Fix #427

When the ellipsoid is being initialized as a bound, it's centred at zero and has unit variance, while it should be centred at 0.5 and have cov of ndim/4

coveralls commented 1 year ago

Pull Request Test Coverage Report for Build 4463250073


Changes Missing Coverage Covered Lines Changed/Added Lines %
py/dynesty/dynesty.py 6 8 75.0%
<!-- Total: 54 56 96.43% -->
Totals Coverage Status
Change from base Build 4298137003: 0.01%
Covered Lines: 4150
Relevant Lines: 4513

💛 - Coveralls
segasai commented 1 year ago

This is ready to get merged (it'd be great to get a review though).

The reason I'm not merging this yet is because when running test_resume.py one can see sometimes a slowdown when resuming the run (revealed when looking at #432 ). I think this means that some necessary information about bound update status is not fully propagated when saving/resuming.

Snippet from the interrupted run

475it [00:04, 81.38it/s, batch: 0 | bound: 0 | nc: 16 | ncall: 1197 | eff(%): 31.730 | loglstar:   -inf < -13.512 <    inf | logz: -17.612 +/-  0.102 | dlogz: 16.065 >  0.010]terminating
resuming
634it [00:00, 6329.26it/s, batch: 0 | bound: 0 | nc: 8 | ncall: 2375 | eff(%): 26.695 | loglstar:   -inf < -8.134 <    inf | logz: -12.384 +/-  0.102 | dlogz: 11267it [00:00, 1236.87it/s, batch: 0 | bound: 1 | nc: 2 | ncall: 11294 | eff(%): 11.218 | loglstar:   -inf < -0.917 <    inf | logz: -5.107 +/-  0.104 | dlogz: 1

as opposed to the non-interrupted run

468it [00:00, 4677.96it/s, batch: 0 | bound: 0 | nc: 3 | ncall: 1187 | eff(%): 31.473 | loglstar:   -inf < -13.867 <    inf | logz: -17.886 +/-  0.102 | dlogz: 936it [00:00, 2893.89it/s, batch: 0 | bound: 0 | nc: 9 | ncall: 6559 | eff(%): 13.646 | loglstar:   -inf < -2.738 <    inf | logz: -6.959 +/-  0.104 | dlogz:  31263it [00:00, 1620.68it/s, batch: 0 | bound: 1 | nc: 4 | ncall: 10713 | eff(%): 11.468 | loglstar:   -inf < -0.904 <    inf | logz: -5.014 +/-  0.103 | dlogz: 
1483it [00:00, 1354.41it/s, batch: 0 | bound: 2 | nc: 3 | ncall: 11127 | eff(%): 12.978 | loglstar:   -inf < -0.420 <    inf | logz: -4.539 +/-  0.103 | dlogz: 1

One cam see the difference in ncalls

and the interrupted/resumed run takes ~ 2 min as opposed to 10 sec. The issue seems to occur with the pool only.

Also it's not quite clear if this is really a new bug introduced with #428 or not, since the previous bounding code was somewhat obscure to me.

segasai commented 1 year ago

I've merged this. The issue with the slow down of test/resume is not connected to the patch, but instead just a result of starting the batch with logl value below the logl value of the first bound.

joshspeagle commented 1 year ago

Thanks so much for merging this in and tracking down the ncall differences! (Sorry for being slow with the review request -- the end of the semester has been a bit crazy!)

segasai commented 1 year ago

No worries. I've just created one additional bounding issue that may need addressing in the future.