Closed gideonsimpson closed 1 year ago
Here's a minimal example in which we pass the random_state
generator to the mapped processes.
>>> import pathos.pools as pp
>>> from pathos.helpers import mp_helper as mp
>>> p = pp.ProcessPool()
>>> p.map(lambda x: mp.random_state('numpy.random').randn() + x, [0,0,0,0])
[-1.182340000425915, -0.6958222183538095, -1.491861526240052, 0.7176480875799546]
>>> p.close(); p.join(); p.clear()
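An alternative to `mp.random_state` is to hand each mapped task its own seed or generator. This is a minimal sketch using NumPy's `SeedSequence`/`default_rng` API (my own illustration, not taken from the pathos docs; the sequential loop stands in for `p.map`):

```python
import numpy as np

def f(seed):
    # Each task builds its own independent Generator from its seed,
    # so draws in different tasks never collide.
    rng = np.random.default_rng(seed)
    return rng.standard_normal()

# Derive four statistically independent child seeds from one root seed.
seeds = np.random.SeedSequence(42).spawn(4)
results = [f(s) for s in seeds]
```

With a pool you would pass `f` and the seed list to `p.map(f, seeds)`; the list comprehension here just keeps the sketch self-contained and reproducible.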
If we don't construct the random state for the random number generator ourselves, then multiprocess
often produces the same seed in each worker (forked workers copy the parent's RNG state, and time-based seeding can also collide) when random numbers are drawn in parallel. Here's the wrong way to do it:
>>> import pathos.pools as pp
>>> import numpy as np
>>> p = pp.ProcessPool()
>>> p.map(lambda x: np.random.randn() + x, [0,0,0,0])
[-1.0382938460666182, -1.0382938460666182, -1.0382938460666182, -1.0382938460666182]
>>> p.close(); p.join(); p.clear()
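To see why every worker can return the same number: under fork, each child receives a copy of the parent's global NumPy RNG state, and identical state yields identical draws. A single-process sketch of that mechanism (no pool needed to demonstrate it):

```python
import numpy as np

# Save the global RNG state, draw once, then restore the state --
# restoring plays the role of a forked child starting from the
# parent's state. Identical state gives an identical draw.
state = np.random.get_state()
first = np.random.randn()
np.random.set_state(state)
second = np.random.randn()
# first and second are equal, just as the four mapped calls above were.
```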
You can also use the random state generator from the random
module in the standard library:
>>> import pathos.pools as pp
>>> from pathos.helpers import mp_helper as mp
>>> p = pp.ProcessPool()
>>> p.map(lambda x: mp.random_state('random').random() + x, [0,0,0,0])
[0.5115375052641259, 0.8660463155953868, 0.9085464421107288, 0.07722073763171311]
>>> p.close(); p.join(); p.clear()
Added a test/example in 0822da123923d19c990c288a8c6938c1b2ccb98c.
Could we get a minimal example of how to safely/properly perform random number generation using the pathos multiprocessing module (with NumPy)? I've seen different answers, including passing distinct seeds as additional arguments and passing distinct realizations of the random number generator. The case that I have in mind is running a function `f(x)` on many different inputs, `x`, using the `map` function with a process pool. Here, `f` involves random number generation.
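For the seeds-as-extra-arguments variant mentioned above, here is a minimal sketch using only the standard-library `random` module (the two-argument `f` and the sequential loop are my illustration, not pathos API):

```python
import random

def f(x, seed):
    # Give each input its own Random instance, so parallel draws are
    # both reproducible and distinct across tasks.
    rng = random.Random(seed)
    return rng.random() + x

inputs = [0, 0, 0, 0]
results = [f(x, seed) for x, seed in zip(inputs, range(len(inputs)))]
```

Since pathos's `map` accepts multiple iterables like the builtin `map`, something like `p.map(f, inputs, range(len(inputs)))` should behave the same way in a pool, though I haven't verified that here.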