uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Why do I have to re-import locally an already globally imported module? #161

Open CyBu opened 5 years ago

CyBu commented 5 years ago
import numpy as np

class Numbers():

    def __init__(self):
        pass

    def make_random_number(self, n=1):  # <--- n=1 does not do anything, but it won't work without it
        import numpy as np              # <--- why re-import locally?
        return np.random.randn()

    def make_n_random_numbers(self, n):
        from pathos.multiprocessing import ProcessingPool as Pool
        pool = Pool().map
        result = pool(self.make_random_number, range(n))
        return result

A = Numbers()
A.make_n_random_numbers(5)
[0.31278970370730214, -0.12968823531781395, -0.6845478093975814, -1.9751374096307035, -0.7172900056712331]

mmckerns commented 5 years ago

Think about it like this: you are building a function object out of make_random_number, and that object gets passed to a new processor. If it doesn't carry a fully encapsulated namespace, it has dangling references, which can lead to a NameError or other errors. Translation: the name np is undefined on the new processor where your function is running. Either you import numpy as np locally, or pathos has to serialize and ship the entire numpy module (or at least know to import it as np), so that when your code looks up np in the namespace, it finds something.
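The namespace point can be illustrated without pathos at all. The sketch below is only a simulation, not a reproduction of how pathos or dill actually serialize functions: it rebuilds a function from its code object with a fresh globals dict, as a rough stand-in for a worker where the module-level import never ran. The stdlib `random` module stands in for numpy, and `rebuild_in_empty_namespace` and `demo` are hypothetical names introduced here for illustration.

```python
import builtins
import random
import types


def uses_global(_=None):
    # `random` is looked up in the function's global namespace at call time.
    return random.gauss(0.0, 1.0)


def uses_local_import(_=None):
    # The import runs inside the body, so `random` is bound locally on each call.
    import random
    return random.gauss(0.0, 1.0)


def rebuild_in_empty_namespace(func):
    """Hypothetical helper: recreate func with globals that hold only builtins.

    This roughly mimics a namespace where the module-level import is missing.
    """
    fresh_globals = {"__builtins__": builtins}
    return types.FunctionType(
        func.__code__, fresh_globals, func.__name__, func.__defaults__
    )


def demo():
    bare_global = rebuild_in_empty_namespace(uses_global)
    bare_local = rebuild_in_empty_namespace(uses_local_import)

    try:
        bare_global()
        global_ok = True
    except NameError:
        # The global name `random` does not exist in the fresh namespace.
        global_ok = False

    # The local import supplies the name itself, so this call succeeds.
    local_ok = isinstance(bare_local(), float)
    return global_ok, local_ok
```

Under these assumptions, `demo()` reports that the version relying on the global name fails with a NameError while the version with the local import works, which mirrors the behavior described in the comment above.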