Closed dsanalytics closed 4 years ago
BTW: calling freeze_support gives the same error - v2 of the code below
def testf(x):
    return testf2(x)

def testf2(x):
    return(x)

def process_all():
    import dill
    import pathos
    from pathos.multiprocessing import Pool
    pool = Pool()
    # out2 = pool.map(testf2, range(10)) # this works - prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    out2 = pool.map(testf, range(10)) # this gives 'NameError: name 'testf2' is not defined'
    pool.close()
    print(out2)

if __name__ == '__main__':
    import pathos
    pathos.helpers.freeze_support()
    process_all()
Any update? I have the same issue.
I am experiencing the same issue when the function I supply calls out to a different Python module (search_solr.py).
  File "D:\Programming\Python\3.7.2\lib\site-packages\pathos\multiprocessing.py", line 137, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "D:\Programming\Python\3.7.2\lib\site-packages\multiprocess\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "D:\Programming\Python\3.7.2\lib\site-packages\multiprocess\pool.py", line 657, in get
    raise self._value
NameError: ("name 'search_solr' is not defined", 'occurred at index 0')
It works if you define the functions as methods:
import pathos

class parallel():
    def foo(self, x):
        return x

    def bar(self, x):
        return self.foo(x)

if __name__ == '__main__':
    p = parallel()
    pool = pathos.multiprocessing.Pool(processes=2)
    print(pool.map(p.bar, [0, 1]))
C:\Users\jeckz\PycharmProjects\pin\venv\Scripts\python.exe C:/Users/jeckz/PycharmProjects/pin/venv/pathostest.py
[0, 1]
Process finished with exit code 0
However, if you import a package that you want to use within one of the methods, for instance:
import pathos
import math

class parallel():
    def foo(self, x):
        return math.asin(x)

    def bar(self, x):
        return self.foo(x)

if __name__ == '__main__':
    p = parallel()
    pool = pathos.multiprocessing.Pool(processes=2)
    print(pool.map(p.bar, [0, 1]))
I receive:
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/jeckz/PycharmProjects/pin/venv/pathostest.py", line 14, in <module>
    print(pool.map(p.bar, [0, 1]))
  File "C:\Users\jeckz\PycharmProjects\pin\venv\lib\site-packages\multiprocess\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\jeckz\PycharmProjects\pin\venv\lib\site-packages\multiprocess\pool.py", line 657, in get
    raise self._value
NameError: name 'math' is not defined
The workaround I found is to import math and store it as an instance attribute. But I would be happy if someone found a better way.
import math

class parallel():
    def __init__(self):
        self.math = math

    def foo(self, x):
        return self.math.asin(x)
I'm not sure, but this may be a Windows issue (i.e. #65). I tried all of the code above on macOS, and it works. Since the errors are similar to those seen in https://github.com/uqfoundation/multiprocess/issues/65, and Windows has different forking behavior than Mac or Linux, I expect it's related to that. I'll see about doing some follow-up testing on Windows. What about if you use dill.settings['recurse'] = True? This has been seen to work around related issues on macOS.
I had the same problem on MacOS and dill.settings['recurse'] = True solved it. Thank you!
Is there any update on this? I have the same problem on Windows and dill.settings['recurse'] = True did not fix it.
I believe this has been solved by https://github.com/uqfoundation/dill/pull/363. Please reopen this issue if that is not the case.
@mmckerns I have the same issue on Windows, and I feel the issue is not solved yet. Do you want me to reopen this one or create a new issue?
@amit8121: if you feel the issue isn't solved yet, then please reopen and post the details that you are seeing. If it's determined to be a new issue, then we will move to a new ticket. Please note your versions of dill, multiprocess, pathos, and Python, as well as a snippet of code that produces the issue, and the traceback you see. Do note that the patch mentioned above was to dill, and that patch is not in any of the released versions yet (coming very soon).
Just so that anyone having a similar issue on Windows knows, it can also be solved with dill.settings['recurse'] = True. Specifically, on Windows, install multiprocess instead of using multiprocessing, and update to the latest dill in conda. Then:
import dill
dill.settings['recurse'] = True
After that, just use it as normal: from multiprocess import Pool
@charey6 Thanks. Some questions: 1) Which version of python, dill, and multiprocess are you running? 2) What's wrong/diff with multiprocessing as opposed to multiprocess? 3) github link to multiprocess lib repo?
Also, can someone test and confirm this? Thank you.
@dsanalytics: with regard to (2), multiprocessing uses pickle while multiprocess uses dill and some very slightly modified Pickler classes. This enables better object serialization, which includes the storing of dependencies (as in your case above). With regard to (3), see https://github.com/uqfoundation/multiprocess, but you can also install it with pip.
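To illustrate the difference described above, here is a small sketch of how the standard pickle module treats functions. It stores only a reference (module name plus function name) and cannot serialize a lambda at all, whereas dill can serialize the function itself. The helper names pickles_with_stdlib and dill_roundtrip are made up for this example, and the dill part is skipped when dill is not installed.

```python
import math
import pickle

# pickle stores a function as a reference to its module and name, not its code.
payload = pickle.dumps(math.asin)
restored = pickle.loads(payload)  # looked up again by name on load


def pickles_with_stdlib(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickling it by reference fails.
square = lambda x: x * x


def dill_roundtrip(func):
    """Serialize and restore func with dill; return None if dill is missing."""
    try:
        import dill
    except ImportError:
        return None
    return dill.loads(dill.dumps(func))


if __name__ == '__main__':
    print(restored is math.asin)
    print(pickles_with_stdlib(square))
    clone = dill_roundtrip(square)
    if clone is not None:
        print(clone(4))
```

A by-reference payload only works if the receiving process can look the name up again, which fits the kind of failure reported in this thread.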
I'm getting name 'testf2' is not defined - screenshots below. I've searched all over and tried many things to no avail. As you can see, this bug makes pathos unusable for any non-trivial processing. Direct call works (commented line) but, as we know, that's not a real-life scenario. Error shows up in both Atom and VSCode and on two completely different machines. Thank you in advance for your help.
Environment:
Windows 7 Home Premium 64 bit
Windows 10 Home Premium 64 bit
Python 3.6 64bit
Atom 1.23.2 & VSCode 1.19.2
Anaconda 3 (machine 1), no Anaconda (machine 2)
pathos 0.2.1