Closed gideonsimpson closed 1 year ago
Here's a minimal example in which we pass the random_state
generator to the mapped processes.
>>> import pathos.pools as pp
>>> from pathos.helpers import mp_helper as mp
>>> p = pp.ProcessPool()
>>> p.map(lambda x: mp.random_state('numpy.random').randn() + x, [0,0,0,0])
[-1.182340000425915, -0.6958222183538095, -1.491861526240052, 0.7176480875799546]
>>> p.close(); p.join(); p.clear()
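An alternative to `mp.random_state` is to hand each mapped task its own seed or generator. This is a minimal sketch using NumPy's `SeedSequence`/`default_rng` API (my own illustration, not taken from the pathos docs; the sequential loop stands in for `p.map`):

```python
import numpy as np

def f(seed):
    # Each task builds its own independent Generator from its seed,
    # so draws in different tasks never collide.
    rng = np.random.default_rng(seed)
    return rng.standard_normal()

# Derive four statistically independent child seeds from one root seed.
seeds = np.random.SeedSequence(42).spawn(4)
results = [f(s) for s in seeds]
```

With a pool you would pass `f` and the seed list to `p.map(f, seeds)`; the list comprehension here just keeps the sketch self-contained and reproducible.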
If we don't construct the random state for the random number generator ourselves, then multiprocess
often produces the same seed in each worker (forked workers copy the parent's RNG state, and time-based seeding can also collide) when random numbers are drawn in parallel. Here's the wrong way to do it:
>>> import pathos.pools as pp
>>> import numpy as np
>>> p = pp.ProcessPool()
>>> p.map(lambda x: np.random.randn() + x, [0,0,0,0])
[-1.0382938460666182, -1.0382938460666182, -1.0382938460666182, -1.0382938460666182]
>>> p.close(); p.join(); p.clear()
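To see why every worker can return the same number: under fork, each child receives a copy of the parent's global NumPy RNG state, and identical state yields identical draws. A single-process sketch of that mechanism (no pool needed to demonstrate it):

```python
import numpy as np

# Save the global RNG state, draw once, then restore the state --
# restoring plays the role of a forked child starting from the
# parent's state. Identical state gives an identical draw.
state = np.random.get_state()
first = np.random.randn()
np.random.set_state(state)
second = np.random.randn()
# first and second are equal, just as the four mapped calls above were.
```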
You can also use the random state generator from the random
module in the standard library:
>>> import pathos.pools as pp
>>> from pathos.helpers import mp_helper as mp
>>> p = pp.ProcessPool()
>>> p.map(lambda x: mp.random_state('random').random() + x, [0,0,0,0])
[0.5115375052641259, 0.8660463155953868, 0.9085464421107288, 0.07722073763171311]
>>> p.close(); p.join(); p.clear()
Added a test/example in 0822da123923d19c990c288a8c6938c1b2ccb98c.
Could we get a minimal example of how to safely/properly perform random number generation using the pathos multiprocessing module (with NumPy)? I've seen different answers, including passing distinct seeds as additional arguments and passing distinct realizations of the random number generator. The case that I have in mind is running a function `f(x)` on many different inputs, `x`, using the `map` function with a process pool. Here, `f` involves random number generation.
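For the seeds-as-extra-arguments variant mentioned above, here is a minimal sketch using only the standard-library `random` module (the two-argument `f` and the sequential loop are my illustration, not pathos API):

```python
import random

def f(x, seed):
    # Give each input its own Random instance, so parallel draws are
    # both reproducible and distinct across tasks.
    rng = random.Random(seed)
    return rng.random() + x

inputs = [0, 0, 0, 0]
results = [f(x, seed) for x, seed in zip(inputs, range(len(inputs)))]
```

Since pathos's `map` accepts multiple iterables like the builtin `map`, something like `p.map(f, inputs, range(len(inputs)))` should behave the same way in a pool, though I haven't verified that here.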