Closed cdcapano closed 7 years ago
@@ master #12 diff @@
==========================================
Files 7 7
Lines 656 681 +25
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 250 251 +1
- Misses 406 430 +24
Partials 0 0
Powered by Codecov. Last update 4616db6...6a5a124
Closing this. Not sure it's the correct thing to do, plus PR #13 would require some changes to this. May revisit later.
This patch allows one to set the
kde_size
<nwalkers
. In that case, multiple kdes are created, with subsets of walkers assigned to each kde. For example, ifnwalkers
is 5000 andkde_size
is 500, 10 kdes are created, with the first one using the first 500 walkers, the next 500 the second, etc. Doing this necessitates some changes to how Sampler stores kdes. Now, _kde is a list of kdes. Also, max_samples cannot be passed when calling run_mcmc. Instead,kde_size
is set once when the sampler class is initialized.The reason for doing this is I've found that calls to the kde can become fairly slow when using a large number of walkers. Here's a plot of the amount of time it takes to call the kde (measured by sticking time.time() calls just before and after the kde is called in
_GetLnProbWrapper
) using 5000 walkers (note: you need LVC credentials to view): https://www.atlas.aei.uni-hannover.de/~cdcapano/LSC/projects/pycbc_inference/cbc/gw150914/kombine/5kwalkers_1kiterations/kde-all_timing.pngThe plot shows the min, max, and mean averaged over all of the walkers at each iteration during burn-in; dashed line shows the update intervals. Here is the same plot after this patch, setting the kde size to 500:
https://www.atlas.aei.uni-hannover.de/~cdcapano/LSC/projects/pycbc_inference/cbc/gw150914/kombine/5kwalkers_1kiterations-500kde_pts/kde-all_timing.png
Here is the resulting posteriors on GW150914 from this later run:
https://www.atlas.aei.uni-hannover.de/~cdcapano/LSC/projects/pycbc_inference/cbc/gw150914/kombine/5kwalkers_1kiterations-500kde_pts/posteriors-scatter.png
I don't think it's necessary to make the resolution of the kde be the same as the number of walkers, just that the draws from a proposal distribution be independent for each walker. This does mean that each subset of walkers sees a slightly different proposal distribution. I think that should be ok, but I'm not an expert on MCMCs, so please correct me if I'm wrong.
I'm also new to this package. I tried to match coding styles; apologies if I've messed something up.