minaskar / zeus

⚡️ zeus: Lightning Fast MCMC ⚡️
https://zeus-mcmc.readthedocs.io/
GNU General Public License v3.0

Poor parallelisation scaling #32

Open tilmantroester opened 1 year ago

tilmantroester commented 1 year ago

I found that running zeus with n_MPI = n_walker/2 gives poor efficiency, with half of the CPU time being spent idling.

If I understand the code in EnsembleSampler.sample correctly, the stepping-out procedure is repeated until all walkers in the ensemble have reached their step-out position, and only then does the shrinking procedure begin. This means that the ensemble has to wait until the last walker reaches its step-out position, during which time all other walkers are idling. Please correct me if I have misunderstood the implementation.

Since the stepping-out and shrinking procedures are independent for each walker once the directions are set, it should be possible to restructure the loops such that each walker can start shrinking as soon as it has finished stepping out, rather than having to wait for the last walker.
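
The restructuring described above can be sketched as follows. This is a minimal illustration and not the zeus implementation: `slice_move`, `update_walkers`, `log_gauss`, the `seeds` argument, and the `map_fn` parameter are hypothetical names, and `direction` is assumed to already include any scale factor. Each walker runs its own stepping-out and shrinking loop inside a single task, so the only synchronisation point is the end of the ensemble update.

```python
from functools import partial

import numpy as np


def slice_move(logp, x, direction, seed, maxsteps=10_000):
    """Slice-sample one walker along `direction` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    z = logp(x) - rng.exponential()      # log-height defining the slice
    left = -rng.uniform()                # unit interval placed randomly around t = 0
    right = left + 1.0
    j = int(rng.integers(0, maxsteps))   # random split of the step-out budget
    k = maxsteps - 1 - j
    # Stepping out: expand each end until it leaves the slice.
    while j > 0 and logp(x + left * direction) > z:
        left -= 1.0
        j -= 1
    while k > 0 and logp(x + right * direction) > z:
        right += 1.0
        k -= 1
    # Shrinking starts immediately for this walker, without waiting for others.
    while True:
        t = rng.uniform(left, right)
        y = x + t * direction
        if logp(y) > z:
            return y
        if t < 0.0:
            left = t
        else:
            right = t


def update_walkers(logp, walkers, directions, seeds, map_fn=map):
    """Update all walkers; `map_fn` must accept the same arguments as built-in map."""
    move = partial(slice_move, logp)
    return np.array(list(map_fn(move, walkers, directions, seeds)))


def log_gauss(x):
    """Toy target: standard normal log-density up to a constant."""
    return -0.5 * np.sum(x**2)


rng = np.random.default_rng(0)
walkers = rng.normal(size=(8, 3))
directions = rng.normal(size=(8, 3))
new_walkers = update_walkers(log_gauss, walkers, directions, seeds=range(8))
```

Passing a parallel map such as `concurrent.futures.Executor.map` as `map_fn` would distribute one task per walker. The trade-off is that all likelihood evaluations for a walker happen on the same worker, so the slowest walker still determines the time of an ensemble update, but the other workers are no longer blocked at every stepping-out and shrinking iteration.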

On a somewhat unrelated note, is there a reason the maxsteps are distributed randomly between left and right here: https://github.com/minaskar/zeus/blob/master/zeus/ensemble.py#L566 ?