joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License

Convergence testing with distance travelled #467

Open ColmTalbot opened 7 months ago

ColmTalbot commented 7 months ago

Reference issue

Motivated by discussion #365 and #366

What does your PR implement/fix?

Hi @segasai, first of all I'm sorry for my extended absence on #365 and #366. I went down a deep rabbit hole related to the discussion in #365, and that, combined with other work/life commitments, didn't leave much time for implementing this. I think I've got to the bottom of it now, and if you're still willing I'd like to get back into this.

I've attached a draft describing a new test, implemented in this branch, that uses the distance travelled in the MCMC as a diagnostic.

I think this test, together with the toy model problems we've discussed before, makes an excellent basis for a set of tests for potential changes to the sampling/MCMC methods.

nested_sampling_distance_test.pdf
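For readers following along, here is a minimal sketch of the basic quantity involved (function names are hypothetical and illustrative only; the real implementation is in the branch): the Euclidean distance each MCMC chain travels from its starting point, which should be comparable to the spread of the live points if the chains are well mixed.

```python
import numpy as np

def distance_travelled(start, end):
    """Euclidean distance between each chain's start and end points.

    start, end : (nchains, ndim) arrays in the unit cube (hypothetical
    inputs; the PR records the relevant quantities during sampling).
    """
    return np.linalg.norm(np.asarray(end) - np.asarray(start), axis=1)

def live_point_scale(live_u):
    """Root-mean-square pairwise distance of the live points, a crude
    yardstick for how far a well-mixed chain ought to travel."""
    centred = live_u - live_u.mean(axis=0)
    return np.sqrt(2.0 * np.mean(np.sum(centred**2, axis=1)))

# toy usage: compare travelled distances against the live-point scale
rng = np.random.default_rng(42)
live = rng.uniform(size=(500, 3))
d = distance_travelled(live[:100], rng.uniform(size=(100, 3)))
print(d.shape, float(live_point_scale(live)))
```

If chains systematically travel much less than this scale, they are likely not decorrelating from their starting points.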

I've included a new plot that shows the likelihood/distance insertion index

[image: likelihood/distance insertion index plot]
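For reference, the insertion-index idea in a few lines (helper names are hypothetical; dynesty's actual implementation differs in detail): each newly accepted point's loglikelihood rank within the current live set should be uniformly distributed, which can be checked with a KS test.

```python
import numpy as np
from scipy.stats import kstest

def insertion_index(logl_new, live_logl):
    """Rank of a new point's loglikelihood within the live set it is
    inserted into; under correct sampling this is uniform on
    {0, ..., nlive - 1}."""
    return int(np.searchsorted(np.sort(live_logl), logl_new))

def uniformity_pvalue(indices, nlive):
    """KS test of the insertion indices (mapped to (0, 1) with a
    half-step shift to soften discreteness) against uniformity."""
    u = (np.asarray(indices) + 0.5) / nlive
    return kstest(u, "uniform").pvalue

print(insertion_index(0.5, [0.1, 0.2, 0.9]))  # rank 2 of 3
```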

I've also been experimenting with plots that will show evolution of the KL divergence estimate, e.g.,

[image: evolution of the KL divergence estimate]
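As I understand it, the tracked quantity is closely related to the information H that dynesty already reports: the KL divergence from the prior to the posterior. A back-of-the-envelope running version from a run's log-weights (illustrative only; the plot in the PR may compute it differently):

```python
import numpy as np

def running_information(logl, logwt):
    """Running estimate of H = KL(posterior || prior) after each
    iteration, from dead-point loglikelihoods and log-weights
    (logwt[i] = logl[i] + log of the prior-volume shell).
    Uses H = sum_i p_i * logl_i - logZ with p_i = exp(logwt_i - logZ)."""
    logl = np.asarray(logl)
    logwt = np.asarray(logwt)
    logz = np.logaddexp.accumulate(logwt)        # running evidence
    h = np.empty_like(logz)
    for i in range(len(logwt)):
        p = np.exp(logwt[: i + 1] - logz[i])     # posterior weights so far
        h[i] = np.sum(p * logl[: i + 1]) - logz[i]
    return h
```

Plotting this against iteration would show whether the information estimate has plateaued, which is one way to read the evolution plot above.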
coveralls commented 7 months ago

Pull Request Test Coverage Report for Build 10101131308

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| py/dynesty/plotting.py | 2 | 11 | 18.18% |
| py/dynesty/utils.py | 2 | 29 | 6.9% |
| Total: | 48 | 84 | 57.14% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| py/dynesty/sampling.py | 3 | 93.06% |
| Total: | 3 | |

Totals Coverage Status:
- Change from base Build 10031351920: -0.7%
- Covered Lines: 4188
- Relevant Lines: 4603

💛 - Coveralls
segasai commented 7 months ago

Hi @ColmTalbot ,

Thanks for the updated patches. I will try to take a look at them in the next couple of weeks (a bit busy with teaching now).

 Sergey
segasai commented 6 months ago

Hi, @ColmTalbot

Thanks for the patches. I looked at them and I am certainly a fan of the logl insertion index statistic in particular. It would be great to have that convergence test.

I do, however, have a suggestion for a somewhat different implementation to the one you are proposing. The current implementation is, in my opinion, somewhat invasive, as it adds a bunch of keywords/variables specific to your convergence statistics that need to be stored during sampling. What I have in mind instead is to make sure we store enough general information during the sampling that the convergence statistics (like the ones you are proposing, or other ones) can be calculated after the run.

In my understanding, we could, for example, store the iteration at which each point is added, i.e. on top of the live_u, live_v, live_logl arrays in Sampler() we create a live_additer array (or something similar) that tracks when each point was added. That array can then be propagated when we take out dead points and saved in saved_run.RunRecord(). In the case of a static run, I think it is trivial to use that info to compute the logl order statistic (and likely the distance statistic) after the fact. In the case of a dynamic run things are more complex, but I think doable (maybe more information needs to be stored about the run, but so be it).

Basically my thinking is that I'd prefer these convergence statistics to be computed after the run (outside the sampler) and we just need to ensure the relevant information is saved, so we can properly retrace the steps of the sampler.
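To make that concrete, here is a toy post-hoc computation for the static case (the array layout is assumed, not dynesty's actual RunRecord fields, and the final live points are ignored for brevity). With dead points stored in death order and an add_iter array recording the iteration at which each was added (0 for the initial live points), the live set at any birth can be reconstructed and the insertion index computed after the run:

```python
import numpy as np

def posthoc_insertion_indices(logl, add_iter):
    """Recompute likelihood insertion indices for a static run.

    logl     : dead-point loglikelihoods in death (iteration) order
    add_iter : iteration at which each dead point was added; the
               initial live points carry add_iter = 0. (Assumed
               layout, not dynesty's actual RunRecord fields.)
    """
    logl = np.asarray(logl)
    add_iter = np.asarray(add_iter)
    death_iter = np.arange(len(logl))  # static run: point i dies at iter i
    out = []
    for j in range(len(logl)):
        b = add_iter[j]
        if b == 0:
            continue                   # initial points have no index
        # points born before j that were still alive when j was inserted
        alive = (add_iter < b) & (death_iter >= b)
        out.append(int(np.sum(logl[alive] < logl[j])))
    return np.array(out)

# tiny toy run: 2 initial live points, then 2 replacements
print(posthoc_insertion_indices([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 2]))
```

The point being that nothing here needs to run inside the sampler; it only consumes arrays that could be saved during the run.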

Thanks S

ColmTalbot commented 6 months ago

I had initially hoped that this could be done in post-processing, but the distance check depends on the point used to start the MCMC chain, which would require modifying propose_point to also return this information. I'm happy to look into that, but it feels like it would add more overhead.

The likelihood test can be (and is) done in post-processing using currently available information. I'm not sure how these tests would deal with dynamic mode, though.

One advantage to having these statistics be computed during the run is that for long-running tasks it can save significant time to know after a day that the convergence is bad, rather than waiting a week.

segasai commented 6 months ago

I agree that it can be useful to know some convergence statistics during the run. I would put that lower in priority for the moment. Regarding the distance travelled, I'm inclined to store in the RunRecord the proposed starting point for MCMC-based (rather than uniform) samplers. That again would enable computing the distance statistic in postprocessing.