Closed: Jammy2211 closed this 3 years ago
Sorry about the delay in responding -- the last week has been crazy on my end. Glad to hear the code has been working well for you! 🙂
To describe a bit more what the parallelization scheme is in dynesty in response to your questions:
Hope this helps!
Follow-up question: my log-likelihood function has a lot of stuff in it (data, functions, etc.) and is probably > 1 GB in memory. Am I right in thinking that the implementation of multiprocessing.Pool in Dynesty will essentially pass this log-likelihood function (with all the stuff it holds) to every CPU every time a likelihood evaluation is made?
I'm getting extremely slow performance using multiprocessing, and I think this is the explanation. Just looking for confirmation!
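For reference, here is a minimal, self-contained sketch of what I mean: if the likelihood is a callable object that carries its data as attributes (ours is), everything it holds gets serialized whenever it is pickled for a worker. The class and array size below are illustrative stand-ins, not our actual code:

```python
import pickle

import numpy as np


class LogLikelihood:
    """Illustrative stand-in for a likelihood callable that carries its data."""

    def __init__(self, data):
        self.data = data  # ~1 GB in our real use case

    def __call__(self, theta):
        # Toy Gaussian likelihood against the first len(theta) data points.
        return -0.5 * float(np.sum((self.data[: len(theta)] - theta) ** 2))


loglike = LogLikelihood(np.zeros(1_000_000))  # only ~8 MB here, for illustration

# Everything stored on the instance travels with it when pickled:
print(f"pickled size: {len(pickle.dumps(loglike)) / 1e6:.1f} MB")
```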
> Am I right in thinking that the implementation of multiprocessing.Pool in Dynesty will essentially pass this log-likelihood function (with all the stuff it holds) to every CPU every time a likelihood evaluation is made?
Yes, this is correct. You can get around this by instantiating some of the large objects separately in each member of the pool and then calling them from the likelihood function, but it can get pretty hack-y.
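For illustration, a minimal sketch of that kind of workaround, assuming the large object can be rebuilt inside each worker from something small like a file path (all names and the toy model below are illustrative, not dynesty internals):

```python
import multiprocessing as mp

import numpy as np
import dynesty

_worker_data = None  # per-process global, filled in once by the initializer


def init_worker(path):
    """Runs once in each worker: load the big object there instead of shipping it."""
    global _worker_data
    _worker_data = np.load(path)


def loglike(theta):
    # Reads the per-worker global, so pickling this function stays cheap:
    # only the small parameter vector travels with each evaluation.
    resid = _worker_data["y"] - (theta[0] * _worker_data["x"] + theta[1])
    return -0.5 * float(np.sum(resid**2))


def prior_transform(u):
    return 10.0 * u - 5.0  # uniform priors on [-5, 5), illustrative only


if __name__ == "__main__":
    np.savez("data.npz", x=np.linspace(0, 1, 100), y=np.zeros(100))  # toy data
    init_worker("data.npz")  # the parent process may evaluate loglike too
    with mp.Pool(4, initializer=init_worker, initargs=("data.npz",)) as pool:
        sampler = dynesty.NestedSampler(
            loglike, prior_transform, 2, nlive=50,
            pool=pool, queue_size=4,
        )
        sampler.run_nested()
```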
Great thanks, sounds like I've got a fun task for tomorrow!
For our use case, we have found that the DynestyStatic sampler with random walk mode is INCREDIBLE. It is a complete game changer for our science, so thank you!
Up to now, we have always used the sampler in serial mode, as our use case is such that we parallelize at a higher level and therefore spawn many serial jobs. However, we now have a use case where the likelihood evaluation times are so long (30+ seconds) that the only way to make progress is to parallelize at the level of the non-linear search (hopefully, Dynesty). We have a lot of CPU time to throw at this!
For context, we typically apply the sampler with ~50 live points, rwalks=5, and Gaussian priors 'initialized' to overlap with the high-likelihood regions of parameter space (which we can determine efficiently via fast non-linear searches).
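To make that concrete, here is a minimal sketch of the kind of serial setup I mean, assuming rwalks=5 corresponds to dynesty's sample='rwalk' with walks=5; the toy likelihood and the prior means/widths are stand-ins for our real ones:

```python
import numpy as np
from scipy.stats import norm

import dynesty

ndim = 5


def loglike(theta):
    return -0.5 * float(np.sum(theta**2))  # toy stand-in for our real likelihood


def prior_transform(u):
    # Gaussian priors 'initialized' near the high-likelihood region; the
    # means and sigmas would come from a fast preliminary search in practice.
    return norm.ppf(u, loc=np.zeros(ndim), scale=np.ones(ndim))


sampler = dynesty.NestedSampler(
    loglike, prior_transform, ndim,
    nlive=50, sample="rwalk", walks=5,
)
sampler.run_nested()
```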
I am looking for guidance on whether you think parallelizing this job is simply a matter of using the `pool` feature in Dynesty with the same settings as before (a sketch of what I have in mind is at the end of this post). There are a few specifics it would be good to have clarity on:

1) In MultiNest, when you parallelize a job, all parallel samples after the accepted sample are discarded, meaning that parallelization above ~4 cores is pointless. Am I right in thinking this would not be the case for rwalk sampling in Dynesty?
2) Has the issue discussed here (https://github.com/joshspeagle/dynesty/issues/164) been addressed? Is there any update on this I should be aware of?
3) Are there any other bottlenecks I should be aware of that mean using 30-60 cores in parallel simply will not scale in the way I am hoping?
Any general advice or guidance would be appreciated, if you think there is anything worth me knowing! My plan is to just go ahead and try it out for myself, but I suspect you can point me in the right direction on what I should be aware of :D.
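For concreteness, here is a sketch of the pooled setup I have in mind (the toy likelihood/priors are the same kind of stand-ins as in the sketch above, and ncpu is a placeholder):

```python
import multiprocessing as mp

import numpy as np
import dynesty

ndim = 5


def loglike(theta):
    return -0.5 * float(np.sum(theta**2))  # toy stand-in, as above


def prior_transform(u):
    return 10.0 * u - 5.0  # toy uniform priors on [-5, 5), stand-in only


if __name__ == "__main__":
    ncpu = 30
    with mp.Pool(ncpu) as pool:
        sampler = dynesty.NestedSampler(
            loglike, prior_transform, ndim,
            nlive=50, sample="rwalk", walks=5,
            pool=pool,
            queue_size=ncpu,  # how many proposals dynesty farms out at once
        )
        sampler.run_nested()
```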