joshspeagle / dynesty

Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
https://dynesty.readthedocs.io/
MIT License

Can multiprocessing pools be used? #429

Closed shoepfl closed 1 year ago

shoepfl commented 1 year ago

Dynesty version 2.1.0, installed via pip

Hey, I am a little confused. I tried to use multiprocessing to parallelize dynesty, since there are some examples of this in the issues here.

I use dynesty via pypesto to compile SBML models, my code is:

import os
import multiprocessing
from pathlib import Path

import pypesto
import pypesto.optimize as optimize
import pypesto.petab
import pypesto.sample as sample

M = 'synthetic_data_low_concentration'
model_dir = Path(__file__).parent / M
yaml_file = os.path.join(model_dir, M + '.yaml')
importer = pypesto.petab.PetabImporter.from_yaml(yaml_file)
problem = importer.create_problem(force_compile=True)

# optimize for start values
result = optimize.minimize(problem, n_starts=1)

with multiprocessing.Manager() as manager:
    with manager.Pool() as pool:
        # the sampler arguments must exist before the sampler is constructed
        dynesty_args = {'nlive': 50, 'sample': 'unif',
                        'pool': pool, 'queue_size': multiprocessing.cpu_count(),
                        'bootstrap': 0}
        sampler = sample.DynestySampler(sampler_args=dynesty_args)

        result = sample.sample(problem=problem,
                               sampler=sampler,
                               n_samples=None,
                               result=result)

However, I get a pickle error:

"AttributeError: Can't pickle local object 'DynestySampler.initialize.<locals>.loglikelihood'"

I already implemented it with the multiprocessing manager (https://superfastpython.com/multiprocessing-pool-share-with-workers/) but still get the error.

Is multiprocessing supposed to work with dynesty? If not, what should be passed as the pool argument? OpenMPI is not an option for me, because I use dynesty for systems biology together with amici, which conflicts with OpenMPI in many ways.

Thanks for any help in advance

segasai commented 1 year ago

You are not using dynesty directly, but through something from pypesto instead. So I can't comment on that.

Also, if you are running multiprocessing, you should put your code in

if __name__=='__main__':
    do_something()
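
For illustration, a minimal sketch of that guard using only the standard library. The names `square` and `main` are placeholders, not part of dynesty or pypesto. The pool is created only when the file is run as a script, so worker processes that re-import the module (as the spawn and forkserver start methods do) do not try to start pools of their own.

```python
import multiprocessing


def square(x):
    # Defined at module level, so worker processes can look it up by name.
    return x * x


def main():
    # Create the pool inside main(): a worker that re-imports this module
    # only sees the definitions above and never reaches this code.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(square, range(5))


if __name__ == '__main__':
    print(main())
```

The guard alone does not make a function picklable; it only prevents the module's top-level code from running again in each worker.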

shoepfl commented 1 year ago

Yes, that's correct. The pypesto guys said that it should just pass the arguments to dynesty, so I wanted to ask whether a multiprocessing pool is a valid argument for pool at all.

I will try to use the dunder method.

segasai commented 1 year ago

If, after putting the code inside if __name__=='__main__', using vanilla dynesty.Sampler() still produces the same issue, please write here.

shoepfl commented 1 year ago

Putting the code inside if __name__=='__main__' did not change anything. What do you mean by vanilla dynesty.Sampler?

segasai commented 1 year ago

I mean if you use just the sampler from dynesty, not the pypesto wrapper around it.

shoepfl commented 1 year ago

This is not possible, as my model is in SBML and therefore cannot be passed to dynesty directly. If you have example code for me, I can test whether the error also persists when using dynesty directly. The error message suggests that the error depends only on dynesty.

segasai commented 1 year ago

I can already see that the issue is caused by the wrapper, specifically by this internal function definition which is not pickleable:

https://github.com/ICB-DCM/pyPESTO/blob/13ce29bfd32a5170284f0d7049ccffc7d1ad7b5d/pypesto/sample/dynesty.py#L134

So the issue needs to be addressed there. Dynesty itself doesn't have this limitation and works perfectly fine with a multiprocessing pool.
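
To make the pickling limitation concrete, here is a small standard-library sketch. It is not pyPESTO's actual code: `make_local_loglike`, `LogLikelihood`, and `can_pickle` are hypothetical names chosen for illustration. A function defined inside another function cannot be pickled, and so cannot be sent to pool workers, whereas an instance of a module-level callable class can.

```python
import pickle


def make_local_loglike(offset):
    # Mimics a wrapper that defines the likelihood inside a method:
    # the inner function is a "local object" and pickle rejects it.
    def loglikelihood(x):
        return -0.5 * (x - offset) ** 2
    return loglikelihood


class LogLikelihood:
    """Module-level callable; instances are pickled via the class name."""

    def __init__(self, offset):
        self.offset = offset

    def __call__(self, x):
        return -0.5 * (x - self.offset) ** 2


def can_pickle(obj):
    # The exception type for local objects differs between Python versions.
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


local_ok = can_pickle(make_local_loglike(1.0))
class_ok = can_pickle(LogLikelihood(1.0))
print(local_ok, class_ok)
```

One possible fix on the wrapper side, under these assumptions, would be to replace the inner function with such a module-level callable (or a module-level function), so that the object handed to the pool can be pickled.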

shoepfl commented 1 year ago

Okay, thanks. This already helps a lot, as I was wondering the whole time whether dynesty is supposed to work with multiprocessing.

Then I will see how to solve it in the wrapper. Thanks.