sblunt closed this pull request 4 years ago.
| Totals | |
|---|---|
| Change from base Build 1206: | 0.03% |
| Covered Lines: | 1034 |
| Relevant Lines: | 1180 |
I think with the new emcee you have to make your own process pool and pass it in. Like this:
import multiprocessing as mp
# create the process pool ourselves and hand it to emcee
pool = mp.Pool(self.num_threads)
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, pool=pool)
I think if we create the pool ourselves, then it should work.
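End to end, that pattern would look something like this (toy lnprob and sizes just for illustration, with a context manager so the pool gets cleaned up; not orbitize's actual setup):
import multiprocessing as mp
import numpy as np
import emcee

def lnprob(theta):
    # toy log-probability just for illustration (standard Gaussian)
    return -0.5 * np.sum(theta ** 2)

if __name__ == "__main__":
    ndim, nwalkers, num_threads = 2, 16, 4
    p0 = np.random.randn(nwalkers, ndim)

    # create the pool ourselves and hand it to the sampler; the context
    # manager makes sure the pool is closed when sampling finishes
    with mp.Pool(num_threads) as pool:
        sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob, pool=pool)
        sampler.run_mcmc(p0, 100)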
I think it's a little more complicated, unfortunately. We're using sampler.sample() to run MCMC. This method returns a State object when you're running emcee in serial, but returns a different kind of object (without a "current coordinates" attribute) when passing in the pool object the way you suggested. I don't see an easy way to get the current coordinates out of the generator object returned by the pool sampling, which means we'd have to restructure the API a bit to run parallel emcee the way we're currently running single-threaded emcee and ptemcee. Totally possible, but seemed unnecessary to me. Let me know what you think.
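For reference, here's roughly the serial pattern I'm describing (toy lnprob and shapes, not our actual setup):
import numpy as np
import emcee

def lnprob(theta):
    # toy log-probability just for illustration
    return -0.5 * np.sum(theta ** 2)

ndim, nwalkers = 2, 16
p0 = np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob)

# stepping through sampler.sample() in serial; each yielded State carries
# the current walker positions in .coords, which is what we rely on
for state in sampler.sample(p0, iterations=50):
    current_coords = state.coords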
Ok, I don't think we need to do this now, but I think in the future we'll need to separate the code that calls ptemcee from the code that calls the emcee sampler, because the APIs are bifurcating significantly (basically one chunk of code to call one, and one chunk of code to call the other).
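To give a rough sense of the bifurcation (the ptemcee argument names below are from memory, so treat them as assumptions to double-check):
import numpy as np
import emcee
import ptemcee

def lnlike(theta):
    return -0.5 * np.sum(theta ** 2)   # toy likelihood

def lnprior(theta):
    return 0.0                         # toy flat prior

def lnprob(theta):
    return lnlike(theta) + lnprior(theta)

ndim, nwalkers, ntemps = 2, 16, 5

# emcee: a single combined log-probability, walkers shaped (nwalkers, ndim)
e_sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob)
p0_emcee = np.random.randn(nwalkers, ndim)

# ptemcee: separate likelihood/prior callables plus a temperature ladder,
# walkers shaped (ntemps, nwalkers, ndim); keyword names are assumptions
pt_sampler = ptemcee.Sampler(nwalkers, ndim, lnlike, lnprior, ntemps=ntemps)
p0_pt = np.random.randn(ntemps, nwalkers, ndim)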
I've used the new emcee with parallelization and this is how I do a burn-in of 500 steps + 1000 steps of sampling:
# 1500 total steps: 500 burn-in + 1000 steps of sampling
sampler.run_mcmc(p0, 1500, progress=True)
# discard the burn-in and flatten the walkers into one chain
flat_samples = sampler.get_chain(discard=500, flat=True)
Totally, sounds good! Thanks Jason.
Addresses #141. I ended up not implementing parallel processing because I think most of our users use ptemcee anyway. Added a warning message that encourages users to raise an issue if they want parallel processing with emcee.
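Roughly the kind of warning I mean (a hypothetical sketch, not the exact code; num_threads and use_pt are placeholder names here):
import warnings

def check_parallel_request(num_threads, use_pt):
    # hypothetical sketch, not the actual orbitize code: warn instead of
    # silently running single-threaded when multiple threads are requested
    # with the plain emcee backend
    if num_threads > 1 and not use_pt:
        warnings.warn(
            "Parallel processing with emcee is not implemented; running "
            "single-threaded. Please raise an issue on GitHub if you would "
            "like this feature."
        )

check_parallel_request(num_threads=4, use_pt=False)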