So the parallelization in dynesty is extremely simple: it just calls pool.map instead of the default Python map whenever possible when evaluating points. This works under the assumption that the log-likelihood dominates the runtime, which isn't always true. In the cases where it doesn't, the overhead involved with sending out operations to multiple cores only to immediately re-collect them can be quite prohibitive, which appears to be the problem here.
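For reference, hooking a pool in just means passing an object with a `map` method along with a `queue_size`. Here is a minimal sketch; the likelihood and prior transform below are throwaway placeholders, not anything specific to your problem:

```python
import numpy as np
import dynesty
from multiprocessing import Pool

ndim = 2

def loglike(x):
    # placeholder likelihood: standard normal in each dimension
    return -0.5 * np.sum(x**2)

def prior_transform(u):
    # placeholder prior: uniform on [-10, 10] in each dimension
    return 20.0 * u - 10.0

if __name__ == "__main__":
    with Pool(4) as pool:
        # dynesty swaps the built-in map for pool.map wherever it can;
        # queue_size sets how many proposals are farmed out at once
        sampler = dynesty.NestedSampler(loglike, prior_transform, ndim,
                                        pool=pool, queue_size=4)
        sampler.run_nested()
```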
If you want to control some of this behavior, you can do so using the use_pool argument. This allows you to control exactly what operations are done using the pool and what can just be done on the main core. I've copy-pasted the documentation below to highlight what options you have to play with:
use_pool : dict, optional
A dictionary containing flags indicating where a pool should be used to execute operations in parallel. These govern whether:
- prior_transform is executed in parallel during initialization ('prior_transform'),
- loglikelihood is executed in parallel during initialization ('loglikelihood'),
- live points are proposed in parallel during a run ('propose_point'), and
- bounding distributions are updated in parallel during a run ('update_bound').

Default is True for all options.
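Continuing the sketch above: with a cheap likelihood you might keep everything on the main process except the point proposals. The particular flag values below are just an illustration, not a recommendation for your problem:

```python
# illustrative only: use the pool solely for proposing new live points,
# keeping the prior transform, initial likelihood calls, and bound
# updates on the main process
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim,
                                pool=pool, queue_size=4,
                                use_pool={'prior_transform': False,
                                          'loglikelihood': False,
                                          'propose_point': True,
                                          'update_bound': False})
```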
Let me know if this helps.
Closing this for now.
Hello again
I was trying to understand where parallelization helps while proposing a new point. As I understand it, you fill a queue using pool.map (here) with live points proposed by a sampling function. I can see that, apart from the uniform sampling function, all the other sampling functions are already conditioned on logl_prop > loglstar. So when the queue is filled, do all the points in the queue have log-likelihoods better than the current worst live point (loglstar), except in the case of uniform sampling?
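To make the question concrete, this is roughly how I read the queue-filling step (heavily simplified pseudocode of my understanding, not the actual dynesty source; pick_start and evolve_point are just stand-ins for the sampler internals):

```python
# heavily simplified pseudocode of my reading -- not the real implementation
def fill_queue(live_points, loglstar, evolve_point, pool, queue_size):
    # pick_start stands in for however the sampler chooses starting points
    args = [(pick_start(live_points), loglstar) for _ in range(queue_size)]
    # for every sampler except uniform, evolve_point keeps going until it
    # returns a point with logl > loglstar, so (as far as I can tell) the
    # whole queue should already satisfy that condition here
    return list(pool.map(evolve_point, args))
```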
I am trying to use dynesty for my project but I am having trouble parallelizing the processes.
This is a simplified version of my code. The likelihood is calculated from two uncorrelated Gaussian distributions and I have defined both uniform and non-uniform priors over my parameters:
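It is essentially along these lines (the means, widths, and prior ranges below are placeholders rather than my actual values):

```python
import numpy as np
from scipy import stats

ndim = 2

def loglike(theta):
    # log-likelihood built from two uncorrelated Gaussians
    # (the means and sigmas are placeholders)
    return (stats.norm.logpdf(theta[0], loc=1.0, scale=0.5)
            + stats.norm.logpdf(theta[1], loc=-2.0, scale=1.5))

def prior_transform(u):
    # first parameter: uniform prior on [-5, 5]
    x0 = 10.0 * u[0] - 5.0
    # second parameter: non-uniform (Gaussian) prior via the inverse CDF
    x1 = stats.norm.ppf(u[1], loc=0.0, scale=3.0)
    return np.array([x0, x1])
```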
Initiating 4 engines using ipcluster, with the pool wired up roughly as sketched below, I get the following summary:
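(The IPyPool wrapper here is my own shim so that dynesty gets a blocking `map` from the ipyparallel view.)

```python
import ipyparallel as ipp
import dynesty

rc = ipp.Client()                # connect to the engines started by ipcluster
view = rc.load_balanced_view()

class IPyPool:
    # small shim so dynesty sees a .map that blocks until results arrive;
    # the engines need the same imports (numpy/scipy) available
    def map(self, func, iterable):
        return view.map_sync(func, iterable)

pool = IPyPool()
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim,
                                pool=pool, queue_size=4)
sampler.run_nested()
```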
However, if I don't parallelize and just run the sampler:
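i.e. roughly:

```python
# same likelihood and prior transform, but no pool
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim)
sampler.run_nested()
```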
So my question is: how is dynesty parallelizing the calculations? The total time taken by the parallel run is much greater than for the serial run. Is there something wrong with my setup?