jakeret / abcpmc

Approximate Bayesian Computation Population Monte Carlo
GNU General Public License v3.0

Same seed in Multiprocessing for numpy #13

Open jakgel opened 7 years ago

jakgel commented 7 years ago

Hello, I encountered inconsistent behaviour of random number generation inside the generator function when using multiprocessing with abcpmc.Sampler(..., postfn = testrand, ... ):

```python
def testrand():
    import random
    import numpy as np
    random.randrange(1000)             # is random
    np.random.randint(1000, size=1)    # yields the same result in every process
    np.random.poisson(1000, 10)        # yields the same result in every process
    return 0
```

One possible workaround is to use random to set a random seed for numpy within the function:

    seed = random.randrange(4294967295)
    np.random.seed(seed=seed)

However, this is bulky and might confuse users who are not aware of this behaviour. Could you please consider adapting abcpmc so that numpy.random also produces different random numbers in each process?
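A somewhat shorter variant of this workaround, sketched under the assumption that numpy's legacy global generator is the one in use: calling `np.random.seed()` with no argument reseeds it from fresh OS entropy, so the detour through `random` is not needed. The names `reseed_numpy` and `testrand_reseeded` are hypothetical and not part of abcpmc. The state copy at the end only simulates two processes starting from the same inherited state.

```python
import numpy as np


def reseed_numpy():
    """Reseed numpy's global generator with fresh entropy from the OS."""
    np.random.seed()  # no argument: numpy pulls unpredictable entropy


def testrand_reseeded(size=10):
    """Hypothetical generator function that reseeds before drawing."""
    reseed_numpy()
    return np.random.randint(0, 2**31 - 1, size=size)


# Simulate two workers that start from an identical numpy state.
np.random.seed(42)
inherited = np.random.get_state()

first = testrand_reseeded()
np.random.set_state(inherited)   # second "worker" starts from the same state
second = testrand_reseeded()

print(first)
print(second)
```

Because each call reseeds before drawing, the two printed arrays should differ even though both calls started from the same state. The cost is that runs are no longer reproducible from a single seed.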

Thanks, jakgel

jakeret commented 7 years ago

It has been a while, but this somehow rings a bell. Which version are you using?

It could be that the package on PyPI is slightly outdated. The dev version here on GitHub has a fix for that.

jakgel commented 7 years ago

You are correct that the PyPI package is slightly outdated. However, I am working with the dev version you suggested, and running testrand(...) as the generator in abcpmc still produces the same non-random results in the 'numpy.random' case. The 'random' module, however, is fine.

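```python
# __future__ imports must come before any other statement
from __future__ import division, print_function
import numpy as np


def testrand(notused, randomseed=False):
    import random
    if randomseed:
        # Workaround: draw a seed from 'random' and reseed numpy with it
        seed = random.randrange(4294967295)
        np.random.seed(seed=seed)
        print("Seed was:", seed)
    print(np.random.poisson(4, 5))

    return 0
```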

I theorize that this behaviour arises because the seeds of 'random' and 'numpy.random' are initialized separately. In addition, the two modules use different methods to seed themselves (compare http://forum.cogsci.nl/index.php?p=/discussion/1441/solved-numpy-random-state-seems-to-repeat-across-multiple-os-runs ).
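A small sketch that illustrates both parts of this theory without starting any processes. It assumes the workers are created by fork, so each one begins with a copy of the parent's numpy state; the copy is simulated here with `get_state`/`set_state`. The variable names are only for illustration.

```python
import random

import numpy as np

# 1. 'random' and 'numpy.random' keep separate global state:
#    seeding 'random' leaves numpy's generator untouched.
np.random.seed(0)
key_before = np.random.get_state()[1].copy()
random.seed(12345)
key_after = np.random.get_state()[1]
numpy_untouched = np.array_equal(key_before, key_after)

# 2. Two generators that start from a copy of the same state
#    (as forked workers would) produce identical draws.
inherited = np.random.get_state()
worker_a = np.random.RandomState()
worker_a.set_state(inherited)
worker_b = np.random.RandomState()
worker_b.set_state(inherited)

draws_a = worker_a.poisson(4, 5)
draws_b = worker_b.poisson(4, 5)
same_draws = np.array_equal(draws_a, draws_b)

print("numpy state untouched by random.seed:", numpy_untouched)
print("identical draws from copied state:", same_draws)
```

If both checks hold, the repeated numpy results would follow from the copied state alone, whatever 'random' does on its side.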