sblunt closed this pull request 4 years ago.
| Totals | |
|---|---|
| Change from base Build 1206: | 0.03% |
| Covered Lines: | 1034 |
| Relevant Lines: | 1180 |
I think with the new emcee you have to make your own process pool and pass it in. Like this:
import multiprocessing as mp
# create the process pool ourselves and hand it to emcee
pool = mp.Pool(self.num_threads)
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, pool=pool)
I think if we create the pool ourselves, then it should work.
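End to end, that pattern would look something like this (toy lnprob and sizes just for illustration, with a context manager so the pool gets cleaned up; not orbitize's actual setup):
import multiprocessing as mp
import numpy as np
import emcee

def lnprob(theta):
    # toy log-probability just for illustration (standard Gaussian)
    return -0.5 * np.sum(theta ** 2)

if __name__ == "__main__":
    ndim, nwalkers, num_threads = 2, 16, 4
    p0 = np.random.randn(nwalkers, ndim)

    # create the pool ourselves and hand it to the sampler; the context
    # manager makes sure the pool is closed when sampling finishes
    with mp.Pool(num_threads) as pool:
        sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, pool=pool)
        sampler.run_mcmc(p0, 100)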
I think it's a little more complicated, unfortunately. We're using sampler.sample() to run MCMC. This method returns a State object when you're running emcee in serial, but returns a different kind of object (without a "current coordinates" attribute) when passing in the pool object the way you suggested. I don't see an easy way to get the current coordinates out of the generator object returned by the pool sampling, which means we'd have to restructure the API a bit to run parallel emcee the way we're currently running single-threaded emcee and ptemcee. Totally possible, but seemed unnecessary to me. Let me know what you think.
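For reference, here's roughly the serial pattern I'm describing (toy lnprob and shapes, not our actual setup):
import numpy as np
import emcee

def lnprob(theta):
    # toy log-probability just for illustration
    return -0.5 * np.sum(theta ** 2)

ndim, nwalkers = 2, 16
p0 = np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob)

# stepping through sampler.sample() in serial; each yielded State carries
# the current walker positions in .coords, which is what we rely on
for state in sampler.sample(p0, iterations=50):
    current_coords = state.coords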
Ok, I don't think we need to do this now, but I think in the future we'll need to separate the code that calls ptemcee from the code that calls the emcee sampler, because the APIs are bifurcating significantly (basically one chunk of code to call one, and one chunk of code to call the other).
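To give a rough sense of the bifurcation (the ptemcee argument names below are from memory, so treat them as assumptions to double-check):
import numpy as np
import emcee
import ptemcee

def lnlike(theta):
    return -0.5 * np.sum(theta ** 2)   # toy likelihood

def lnprior(theta):
    return 0.0                         # toy flat prior

def lnprob(theta):
    return lnlike(theta) + lnprior(theta)

ndim, nwalkers, ntemps = 2, 16, 5

# emcee: a single combined log-probability, walkers shaped (nwalkers, ndim)
e_sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob)
p0_emcee = np.random.randn(nwalkers, ndim)

# ptemcee: separate likelihood/prior callables plus a temperature ladder,
# walkers shaped (ntemps, nwalkers, ndim); keyword names are assumptions
pt_sampler = ptemcee.Sampler(nwalkers, ndim, lnlike, lnprior, ntemps=ntemps)
p0_pt = np.random.randn(ntemps, nwalkers, ndim)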
I've used the new emcee with parallelization and this is how I do a burn-in of 500 steps + 1000 steps of sampling:
# 1500 total steps: 500 burn-in + 1000 steps of sampling
sampler.run_mcmc(p0, 1500, progress=True)
# discard the burn-in and flatten the walkers into one chain
flat_samples = sampler.get_chain(discard=500, flat=True)
Totally, sounds good! Thanks Jason.
Addresses #141. I ended up not implementing parallel processing because I think most of our users use ptemcee anyway. Added a warning message that encourages users to raise an issue if they want parallel processing with emcee.
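Roughly the kind of warning I mean (a hypothetical sketch, not the exact code; num_threads and use_pt are placeholder names here):
import warnings

def check_parallel_request(num_threads, use_pt):
    # hypothetical sketch, not the actual orbitize code: warn instead of
    # silently running single-threaded when multiple threads are requested
    # with the plain emcee backend
    if num_threads > 1 and not use_pt:
        warnings.warn(
            "Parallel processing with emcee is not implemented; running "
            "single-threaded. Please raise an issue on GitHub if you would "
            "like this feature."
        )

check_parallel_request(num_threads=4, use_pt=False)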