Closed segasai closed 3 years ago
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
py/dynesty/results.py | 45 | 52 | 86.54% | ||
py/dynesty/utils.py | 21 | 28 | 75.0% | ||
<!-- | Total: | 66 | 80 | 82.5% | --> |
Totals | |
---|---|
Change from base Build 1232247582: | 0.3% |
Covered Lines: | 3425 |
Relevant Lines: | 4173 |
I see there are some unresolved comments related to the results structure that is being saved. I've added some additional info below.
samples_it
: the sampling iteration when the sample was proposed (e.g., iteration 570)samples_id
: the unique ID of the sample (e.g., ID=2)These are primarily useful for "winding back" sampling if we want to see what the live points are doing at some intermediate time during the run, such as in boundplot
.
samples_n
: the number of live points at the point when the sample was proposed (e.g. K=1957)This is mainly useful for dynamic nested sampling.
samples
: the location (in original coordinates). Should be identical to samples_v
I think.eff
: the overall sampling efficiency (niter
/ncall
, more or less)These are just summary aliases, IIRC.
bound
: the set of bounding objects used to condition proposalsbound_iter
: the iteration when the corresponding bound was created to propose new live points (e.g., iteration 520)samples_bound
: the index of the bound that the corresponding sample was drawn from (e.g., bound 1)These are useful when using plotting functions like cornerbound
, since they allow us to track particular bounds. I also left them in in case I wanted to tackle adding something like Importance Nested Sampling into the code, which would require knowing which sample came from which bound.
samples_batch
: tracks the batch during which the samples were proposedbatch_bounds
: the log-likelihood bounds used to sample points in a given batchbatch_nlive
: the number of live points added in a given batchThese are mostly additional quantities I added to make sure I could associate points with batches and track that the batches were being added correctly.
Let me know if I missed anything.
I've fixed some XXs, but some questions still remain remain regarding batch quantities and lengths (i.e. whether something has a lenght niter or not). It'd be good for you to edit this directly. Also I don't quite understand what's the difference between batch_nlive vs samples_n and samples_bound vs batch_bound. It'd be good to clarify that. Maybe even rename some of them for consistency.
Also _merge_two() function in the utils.py mentions prop_iter, samples_prop and prop properties, but those do not seem to be used anywhere. I assume those are dead ?
The reason why I think this needs cleaning up, is because resutl is an externally returned object, as opposed to RunRecord, so it would be good to ensure, it is fully understood/described what each of those are. Plus one can establish some kind of tests that verify that arrays have correct lengths etc. ATM I'm not sure of that. Also I was potentially thinking of storing the information about proposed point for rwalk/rslice, so one could compute the degree of correlatedness between samples, but I wouldn't want to do that before the results object is fully described.
I've fixed some XXs, but some questions still remain remain regarding batch quantities and lengths (i.e. whether something has a lenght niter or not). It'd be good for you to edit this directly.
That's easy enough for me to do post-merge. No problem.
Also I don't quite understand what's the difference between batch_nlive vs samples_n and samples_bound vs batch_bound. It'd be good to clarify that. Maybe even rename some of them for consistency.
samples_n
tracks the number of effective live points at every iteration, e.g., [180, 179, 178, 177, 190, 190, 190, 195, ...]. batch_nlive
just checks how many live points we added in from a particular batch, e.g., 250. samples_bound
is the associated bounding distribution, whereas batch_bound
is the logl_bound
tuple. I agree that renaming both quantities (esp. the second) would be helpful, especially since I don't think they are currently used elsewhere in the code.
Also _merge_two() function in the utils.py mentions prop_iter, samples_prop and prop properties, but those do not seem to be used anywhere. I assume those are dead ?
These were renamed to bound
. I can try to update this as well when I go through the internal docstrings.
Also I was potentially thinking of storing the information about proposed point for rwalk/rslice, so one could compute the degree of correlatedness between samples, but I wouldn't want to do that before the results object is fully described.
This would be great and get to addressing (in part) #163, since then we would have the possible option of using that info for tuning the number of walks/slices on the fly.
There's a small conflict with the main branch I need to double-check on, but I will try to merge this in tonight and then get around to cleaning everything up in preparation for the v1.2 release.
Thanks. The conflict can be resolved completely in favor of the results_branch.
I think this probably closes the #175 and #186 too
Agreed. I'll tentatively say this:
Here's a working preliminary implementation of the Results() that is more than just a dictionary. It still accepts the list of key values or a dictionary, but prevents manual updates, and adds some additional methods