hvasbath / beat

Bayesian Earthquake Analysis Tool
GNU General Public License v3.0

some thoughts about "chunksize" in iter_parallel_chains function of beat/sampler/base.py #86

Closed (ranneylxr closed this issue 1 year ago)

ranneylxr commented 2 years ago

Hi again. In the iter_parallel_chains function of beat/sampler/base.py (lines 476-482):

        if chunksize is None:
            if draws < 10:
                chunksize = int(np.ceil(float(n_chains) / n_jobs))
            elif draws > 10 and tps < 0.5:
                chunksize = int(np.ceil(float(n_chains) / n_jobs))
            else:
                chunksize = n_jobs

The tps value seems to depend on the hardware (I have installed libamdm), and if we set a bigger n_jobs, the chunksize also gets bigger in the case where tps > 0.5, draws > 10 and stage > 0.

Referring to https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map, a bigger chunksize leads to a smaller number of chunks. When n_jobs exceeds the number of chunks, a bigger n_jobs decreases the number of workers that actually run in parallel, which means the calculation takes longer.

Is that correct? And can I set an arbitrary chunksize manually in a script? Thank you!
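The arithmetic behind this question can be sketched as follows. This is only an illustration, not BEAT's code: `default_chunksize` copies the branch logic quoted above, while `n_chunks` and `busy_workers` are hypothetical helpers, and the assumption is that a pool can keep at most one worker busy per chunk.

```python
import math


def default_chunksize(n_chains, n_jobs, draws, tps):
    """Copy of the quoted branch logic (tps is taken as given)."""
    if draws < 10:
        return int(math.ceil(float(n_chains) / n_jobs))
    elif draws > 10 and tps < 0.5:
        return int(math.ceil(float(n_chains) / n_jobs))
    else:
        return n_jobs


def n_chunks(n_chains, chunksize):
    """Number of chunks the chains are split into (hypothetical helper)."""
    return int(math.ceil(float(n_chains) / chunksize))


def busy_workers(n_chains, n_jobs, chunksize):
    """Upper bound on workers that receive work (hypothetical helper)."""
    return min(n_jobs, n_chunks(n_chains, chunksize))


# Example values are assumptions chosen to hit the final else branch.
n_chains, draws, tps = 100, 100, 1.0
for n_jobs in (4, 10, 20, 50):
    cs = default_chunksize(n_chains, n_jobs, draws, tps)
    print(n_jobs, cs, n_chunks(n_chains, cs), busy_workers(n_chains, n_jobs, cs))
```

With 100 chains in the else branch, the number of busy workers peaks at n_jobs = 10 and drops again for larger n_jobs, which is the effect described above.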

hvasbath commented 2 years ago

Hi again,

Cool that you are still around ;). You are right. The intention behind that is: if your forward model takes a long time, you would rather use a small chunksize, i.e. have the work distributed in smaller chunks to more workers. Otherwise it often happens that a single worker is left with a big chunk of work, and all the other workers have to wait for it to finish before entering the next stage. Vice versa, if you have a fast forward model you want a big chunksize, because initialising the worker then takes longer than the sampling itself. Is that understandable? I could not completely understand what your problem with that setup is, though. For now you cannot define chunksize in the config file, but if it would help you, we can surely add that; it is not a big deal.

Cheers!
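The trade-off described above can be illustrated with a toy model. This is a rough sketch under strong assumptions, not a measurement of BEAT: every chain takes the same time `task_time`, every chunk costs a fixed `chunk_overhead` to hand to a worker, and each chunk goes to whichever worker becomes free first. The function `makespan` and all numbers are hypothetical.

```python
import heapq


def makespan(n_tasks, n_workers, chunksize, task_time, chunk_overhead):
    """Wall time of an idealized pool: chunks go to the earliest free worker."""
    # Split the tasks into chunks; the last chunk may be smaller.
    chunks = []
    remaining = n_tasks
    while remaining > 0:
        size = min(chunksize, remaining)
        chunks.append(size)
        remaining -= size

    # Min-heap of the times at which each worker becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    for size in chunks:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + chunk_overhead + size * task_time)
    return max(free_at)


# Slow forward model (10 s per chain): small chunks keep all 20 workers busy.
slow_big = makespan(100, 20, chunksize=20, task_time=10.0, chunk_overhead=0.1)
slow_small = makespan(100, 20, chunksize=5, task_time=10.0, chunk_overhead=0.1)

# Fast forward model (1 ms per chain): per-chunk overhead dominates.
fast_small = makespan(100, 20, chunksize=1, task_time=0.001, chunk_overhead=0.1)
fast_big = makespan(100, 20, chunksize=5, task_time=0.001, chunk_overhead=0.1)

print(slow_big, slow_small, fast_small, fast_big)
```

In this model the slow case finishes roughly four times sooner with the smaller chunksize, while the fast case finishes sooner with the larger one, because fewer chunks means the overhead is paid fewer times.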

ranneylxr commented 2 years ago

I understand it! Thank you for explaining.

Best regards.

hvasbath commented 1 year ago

Sorry for the late fix, but I apparently didn't get the point correctly until I tried it myself with a larger number of chains. It is fixed in the current dev branch here: https://github.com/hvasbath/beat/pull/121 and should be released to master soon.