Closed segasai closed 1 year ago
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
py/dynesty/dynesty.py | 6 | 8 | 75.0% | ||
<!-- | Total: | 54 | 56 | 96.43% | --> |
Totals | |
---|---|
Change from base Build 4298137003: | 0.01% |
Covered Lines: | 4150 |
Relevant Lines: | 4513 |
This is ready to get merged (it'd be great to get a review though).
The reason I'm not merging this yet is because when running test_resume.py one can see sometimes a slowdown when resuming the run (revealed when looking at #432 ). I think this means that some necessary information about bound update status is not fully propagated when saving/resuming.
Snippet from the interrupted run
475it [00:04, 81.38it/s, batch: 0 | bound: 0 | nc: 16 | ncall: 1197 | eff(%): 31.730 | loglstar: -inf < -13.512 < inf | logz: -17.612 +/- 0.102 | dlogz: 16.065 > 0.010]terminating
resuming
634it [00:00, 6329.26it/s, batch: 0 | bound: 0 | nc: 8 | ncall: 2375 | eff(%): 26.695 | loglstar: -inf < -8.134 < inf | logz: -12.384 +/- 0.102 | dlogz: 11267it [00:00, 1236.87it/s, batch: 0 | bound: 1 | nc: 2 | ncall: 11294 | eff(%): 11.218 | loglstar: -inf < -0.917 < inf | logz: -5.107 +/- 0.104 | dlogz: 1
as opposed to the non-interrupted run
468it [00:00, 4677.96it/s, batch: 0 | bound: 0 | nc: 3 | ncall: 1187 | eff(%): 31.473 | loglstar: -inf < -13.867 < inf | logz: -17.886 +/- 0.102 | dlogz: 936it [00:00, 2893.89it/s, batch: 0 | bound: 0 | nc: 9 | ncall: 6559 | eff(%): 13.646 | loglstar: -inf < -2.738 < inf | logz: -6.959 +/- 0.104 | dlogz: 31263it [00:00, 1620.68it/s, batch: 0 | bound: 1 | nc: 4 | ncall: 10713 | eff(%): 11.468 | loglstar: -inf < -0.904 < inf | logz: -5.014 +/- 0.103 | dlogz:
1483it [00:00, 1354.41it/s, batch: 0 | bound: 2 | nc: 3 | ncall: 11127 | eff(%): 12.978 | loglstar: -inf < -0.420 < inf | logz: -4.539 +/- 0.103 | dlogz: 1
One cam see the difference in ncalls
and the interrupted/resumed run takes ~ 2 min as opposed to 10 sec. The issue seems to occur with the pool only.
Also it's not quite clear if this is really a new bug introduced with #428 or not, since the previous bounding code was somewhat obscure to me.
I've merged this. The issue with the slow down of test/resume is not connected to the patch, but instead just a result of starting the batch with logl value below the logl value of the first bound.
Thanks so much for merging this in and tracking down the ncall differences! (Sorry for being slow with the review request -- the end of the semester has been a bit crazy!)
No worries. I've just created one additional bounding issue that may need addressing in the future.
Fix #427
When the ellipsoid is being initialized as a bound, it's centred at zero and has unit variance, while it should be centred at 0.5 and have cov of ndim/4