Understanding nested sampler with parallelization

ja-vazquez / SimpleMC

Updated version of a simple MCMC code for cosmological parameter estimation where only expansion history matters.

GNU General Public License v2.0

Understanding nested sampler with parallelization #42

Closed. camarman closed this issue 1 year ago

camarman commented 1 year ago

Suppose we have the two models and two datasets given in paraltest.py:

datasets = ["SN", "BBAO+HD"]
models = ["LCDM", "owaCDM"]

What is the difference between the following ways of running them?

1. Run LCDM + SN first with 1 core, then run owaCDM + BBAO+HD with 1 core.

2. Run LCDM + SN first with 2 cores, then run owaCDM + BBAO+HD with 2 cores.

3. Run LCDM + SN and owaCDM + BBAO+HD together with 2 cores by using MPI.

Which option is better if we want to compare the two models?

igomezv commented 1 year ago

Parallelizing with MPI repeats the same process on different cores. For MCMCAnalyzer this works because its stopping criterion is the Gelman-Rubin diagnostic, which needs two or more chains in order to measure how similar they are.
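To illustrate why several chains are needed, here is a minimal sketch of the textbook Gelman-Rubin statistic for a single parameter. It is not taken from MCMCAnalyzer, whose implementation and convergence threshold may differ; the names `gelman_rubin` and `make_chain` are hypothetical, and the "chains" are just independent Gaussian draws standing in for real MCMC output.

```python
import random
import statistics


def gelman_rubin(chains):
    """Return the Gelman-Rubin R-hat for equal-length chains of one parameter."""
    n = len(chains[0])
    # W: average of the within-chain variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: variance of the chain means (between-chain spread).
    between_over_n = statistics.variance([statistics.fmean(c) for c in chains])
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * within + between_over_n
    return (var_hat / within) ** 0.5


def make_chain(mean, n, seed):
    """Toy stand-in for an MCMC chain: n independent Gaussian draws."""
    rng = random.Random(seed)
    return [rng.gauss(mean, 1.0) for _ in range(n)]


# Two chains sampling the same distribution: R-hat should be close to 1.
print(gelman_rubin([make_chain(0.0, 2000, 1), make_chain(0.0, 2000, 2)]))
# Two chains stuck in different regions: R-hat is clearly larger than 1.
print(gelman_rubin([make_chain(0.0, 2000, 1), make_chain(3.0, 2000, 2)]))
```

With a single chain the between-chain term cannot be computed, which is why running one chain per MPI process fits this criterion naturally.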

For nested sampling, MPI is not effective because the stopping criterion is the difference in Bayesian evidence between two consecutive samples. Instead, the parallelization distributes the evaluation of the likelihood and prior functions for a batch of live points across the cores at the same time. In dynesty, this is done with the multiprocessing library.
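The idea of spreading likelihood evaluations over a batch of points can be sketched with the standard library alone. This is only an illustration under stated assumptions, not SimpleMC's or dynesty's actual code: `log_likelihood` is a hypothetical toy Gaussian, and `evaluate_batch` and `draw_live_points` are made-up helper names. The only real API used is `multiprocessing.Pool` and its `map` method.

```python
import random
from multiprocessing import Pool


def log_likelihood(theta):
    """Toy Gaussian log-likelihood centred at 0.3 (hypothetical, for illustration)."""
    return -0.5 * sum(((t - 0.3) / 0.1) ** 2 for t in theta)


def evaluate_batch(points, pool=None):
    """Evaluate the log-likelihood of every point, using pool.map if a pool is given."""
    mapper = pool.map if pool is not None else map
    return list(mapper(log_likelihood, points))


def draw_live_points(nlive, ndim, seed=0):
    """Draw nlive points uniformly from the unit cube (a flat prior)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(ndim)] for _ in range(nlive)]


if __name__ == "__main__":
    live = draw_live_points(nlive=100, ndim=2)
    serial = evaluate_batch(live)
    # The same batch, with the evaluations shared between 2 worker processes.
    with Pool(processes=2) as pool:
        parallel = evaluate_batch(live, pool=pool)
    print("same values with and without the pool:", serial == parallel)
```

The pool changes only where each likelihood call is executed, not which points are evaluated or what values come back, so the sampler sees the same numbers either way.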

For model comparison the relevant quantity is the Bayesian evidence (logZ). The number of cores only affects the runtime, not the parameter estimation. Just be careful to use the same number of live points and the same stopping criterion for both models, and the same priors on the parameters the two models share.
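As a rough sketch of how two evidences are compared once both runs have finished, the snippet below computes the log Bayes factor and a combined uncertainty. The function name `log_bayes_factor` is hypothetical and the numbers are invented placeholders, not SimpleMC results. It also assumes the two logZ errors are independent, so they are added in quadrature.

```python
import math


def log_bayes_factor(logz_a, logzerr_a, logz_b, logzerr_b):
    """Return ln(Z_a / Z_b) and its uncertainty, assuming independent errors."""
    delta = logz_a - logz_b
    err = math.hypot(logzerr_a, logzerr_b)
    return delta, err


# Placeholder values only: replace with the logZ and its error reported by each run.
delta, err = log_bayes_factor(-10.0, 0.3, -12.0, 0.4)
print(f"ln B = {delta:.2f} +/- {err:.2f}")
```

A positive value favours the first model. Whether the difference is significant depends on the interpretation scale one adopts.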

camarman commented 1 year ago

I see, so in nested sampling the number of cores only affects the runtime and not the analysis itself in any way?

igomezv commented 1 year ago

Yes, that's right.

camarman commented 1 year ago

Okay, thanks a lot