Closed justinlovinger closed 4 years ago
Further testing reveals the issue is with dill
import functools
import pickle
import dill
import multiprocessing

def foo(x):
    return x

def bar(x):
    return foo(x)

def undill_run(dill_func, arg):
    return dill.loads(dill_func)(arg)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=2)
    print pool.map(functools.partial(undill_run, dill.dumps(bar)), [0, 1])
Returns the same error:
NameError: global name 'foo' is not defined
Error does not occur when pickle is used in place of dill (even if dill is imported).
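One way to picture this failure, using only the standard library: a function that is serialized by value has to be rebuilt on the worker with some globals dictionary, and if foo is missing from that dictionary the call fails in the same way as the error above. This is only an illustration of the failure mode, not dill's actual code path; rebuild is a hypothetical helper.

```python
import types


def foo(x):
    return x


def bar(x):
    return foo(x)


def rebuild(func, global_names):
    """Hypothetical helper: recreate func from its code object with the given globals."""
    return types.FunctionType(func.__code__, dict(global_names), func.__name__)


# With foo supplied, the rebuilt function behaves like the original.
complete = rebuild(bar, {"foo": foo})
print(complete(1))

# Without foo, the lookup fails only when the function is called.
incomplete = rebuild(bar, {})
try:
    incomplete(1)
except NameError as exc:
    print("NameError:", exc)
```

The point of the sketch is that the error appears at call time in the worker, which matches a traceback that surfaces from pool.map rather than from the pickling step.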
This is the case not only for defined functions but also for imported modules. Due to this bug I had to switch back to standard library multiprocessing + dill.
@JustinLovinger @DavidLP: I'm not seeing this error on MacOS. However, I do see that at least @JustinLovinger is using Windows. On Windows, you are missing freeze_support, which is required on Windows to be able to run from __main__. It's not a bug, it's a requirement inherited from multiprocessing.
Try:
if __name__ == '__main__':
    pathos.helpers.freeze_support()
    pool = pathos.multiprocessing.Pool(processes=2)
    print(pool.map(bar, [0, 1]))
Let me know if this fixes your code, (and close the ticket)... or if it doesn't please let me know what you are seeing that's different.
Adding pathos.helpers.freeze_support() to @JustinLovinger's example still gives the same exception.
I use Windows, by the way. Using dill for pickling methods and modules + std. library multiprocessing is working for me.
@DavidLP: I've tried it on windows and I am not seeing any problems. So, let's figure out what in your environment is triggering this. Can you provide me your code and traceback, as well as python and system information?
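For anyone gathering the requested details, a small sketch that collects the interpreter, OS, and package versions in one place. It assumes Python 3.8 or newer (for importlib.metadata); installed_version and environment_report are hypothetical helper names, not part of pathos.

```python
import platform
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("pathos", "multiprocess", "dill")):
    """Collect interpreter, OS, and package versions as a dict."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting that output next to the traceback makes it easier to compare environments across reports.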
pathos.helpers.freeze_support() doesn't do the trick for me either, nor for the pickle issue (#125). I updated to Python 2.7.14 via Anaconda a few days ago, with no improvement. Perhaps these issues are related? I'm not sure how back-end debugging works, but I did that for matplotlib using cairo, as suggested elsewhere, and found the bug in three or four clicks. Any suggestion whether this would work here?
I have the same issue on Windows 10 using the same code as OP. Any suggestions on how to fix it?
#pathostest.py
import pathos

def foo(x):
    return x

def bar(x):
    return foo(x)

if __name__ == '__main__':
    pool = pathos.multiprocessing.Pool(processes=2)
    print pool.map(bar, [0, 1])
Running the above code gives me the same error
Traceback (most recent call last):
  File "pathostest.py", line 12, in <module>
    print pool.map(bar, [0, 1])
  File "C:\Python27\lib\site-packages\multiprocess\pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Python27\lib\site-packages\multiprocess\pool.py", line 572, in get
    raise self._value
NameError: global name 'foo' is not defined
@SjurdurS: It's a common error on Windows, and it overwhelmingly comes from one of two things:
1. not calling pathos.helpers.freeze_support() in __main__, and/or
2. not having a correctly built multiprocess (it needs a C compiler)

Sorry this issue seems to be stale for everyone... I'm at an impasse, as I can produce the error by doing either of the two things enumerated above, but can't reproduce the error if both of the above are resolved. Not sure what to do here. If there are no other comments that can elucidate what is different about your environment, and why you are all seeing errors when using freeze_support and have a correctly built multiprocess... then I'll assume that the issue is either a won't-fix or a voided issue.
It works if you put the functions as methods:
import pathos

class parallel():
    def foo(self, x):
        return x

    def bar(self, x):
        return self.foo(x)

if __name__ == '__main__':
    p = parallel()
    pool = pathos.multiprocessing.Pool(processes=2)
    print(pool.map(p.bar, [0, 1]))
C:\Users\jeckz\PycharmProjects\pin\venv\Scripts\python.exe C:/Users/jeckz/PycharmProjects/pin/venv/pathostest.py
[0, 1]
Process finished with exit code 0
However, if you import a package that you want to use within one of the methods, for instance:
import pathos
import math

class parallel():
    def foo(self, x):
        return math.asin(x)

    def bar(self, x):
        return self.foo(x)

if __name__ == '__main__':
    p = parallel()
    pool = pathos.multiprocessing.Pool(processes=2)
    print(pool.map(p.bar, [0, 1]))
I receive:
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/jeckz/PycharmProjects/pin/venv/pathostest.py", line 14, in <module>
    print(pool.map(p.bar, [0, 1]))
  File "C:\Users\jeckz\PycharmProjects\pin\venv\lib\site-packages\multiprocess\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\jeckz\PycharmProjects\pin\venv\lib\site-packages\multiprocess\pool.py", line 657, in get
    raise self._value
NameError: name 'math' is not defined
The workaround I found is to import math and store it as an attribute on the instance. But I would be happy if someone found a better way.
import math

class parallel():
    def __init__(self):
        self.math = math

    def foo(self, x):
        return self.math.asin(x)
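Another workaround sometimes used for this kind of NameError is to import the module inside the method body, so the name is looked up in whichever process runs the call rather than in the serialized globals. A minimal sketch, shown without a pool; whether it avoids the error in a given pathos/dill setup is an assumption to verify, and the class name Parallel is illustrative.

```python
class Parallel:
    def foo(self, x):
        # Local import: executed in the process that actually calls foo().
        import math
        return math.asin(x)

    def bar(self, x):
        return self.foo(x)


if __name__ == "__main__":
    p = Parallel()
    print([p.bar(x) for x in [0, 1]])
```

Compared with storing the module on the instance, this keeps the object's state free of module references, at the cost of repeating the import in each method that needs it.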
I can reproduce this problem when I install only multiprocess and not the pathos framework, and I try to use the Process class by myself.
My configuration is:
Test example:
# test_example_multiprocess.py

def func1():
    print("Hello world!")

def func2():
    func1()

if __name__ == "__main__":
    from multiprocess import Process
    proc = Process(target=func2)
    proc.start()
    proc.join()
and this is the traceback:
(py37) D:\vic\Desktop>python test_example_multiprocess.py
Process Process-1:
Traceback (most recent call last):
  File "C:\GNU\Anaconda3\envs\py37\lib\site-packages\multiprocess\process.py", line 297, in _bootstrap
    self.run()
  File "C:\GNU\Anaconda3\envs\py37\lib\site-packages\multiprocess\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "test_example_multiprocess.py", line 9, in func2
    func1()
NameError: name 'func1' is not defined

(py37) D:\vic\Desktop>
The same snippet works when I use the standard multiprocessing library:
# test_example_multiprocessing.py

def func1():
    print("Hello world!")

def func2():
    func1()

if __name__ == "__main__":
    from multiprocessing import Process
    proc = Process(target=func2)
    proc.start()
    proc.join()
(py37) D:\vic\Desktop>python test_example_multiprocessing.py
Hello world!
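One likely reason the standard-library version is unaffected: pickle stores a module-level function as a reference (module name plus function name) rather than as code, so whoever unpickles it imports the module and gets the function together with its globals. A small sketch of that behavior, using json.dumps as a stand-in for any importable function:

```python
import json
import pickle

payload = pickle.dumps(json.dumps)

# The payload contains names to look up, not the function's bytecode.
print(b"json" in payload, b"dumps" in payload)

# Unpickling imports the module and returns the very same function object.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

A by-value serializer has to carry the function's globals itself, which is where a missing name such as func1 can slip through.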
The same problem occurs with another environment using Python 3.6.10 and multiprocess 0.70.9, and it does not occur with Python 3.5.6 and multiprocess 0.70.5. So it seems something occurred between these two versions that introduced this issue.
Edit: I dug a bit deeper, and in fact the multiprocess version is not the problem; it comes from dill. For Python 3.6.10 and Python 3.7.6, multiprocess 0.70.9 works with dill 0.2.8.2 and has the bug with dill 0.2.9 or newer. I have the feeling this is dill issue #323.
I'm going to label this a bug, even though it was a dill bug. Should be fixed due to https://github.com/uqfoundation/dill/issues/363
@mmckerns Dear mmckerns, I don't understand why this issue is closed, because it is not solved! I installed the latest pathos on Windows 10. If you run JustinLovinger's code in interactive JupyterLab mode, you still get "NameError: name 'foo' is not defined". It does work in script mode, though. I suggest reopening this issue and fixing this annoying bug completely. Thank you very much.
@simonnier: Did you run the code in Jupyter in a single cell or in multiple cells? Jupyter messes with the structure of the global namespace (each cell is in its own local namespace), and also messes with parallelism. This issue is closed because it was for the behavior in python in general. Working across multiple cells in Jupyter, for example, should only work under certain circumstances, but isn't guaranteed for this or several other features of pathos and dill. Feel free to open a new ticket, specifically requesting this feature in a notebook.
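Since behavior differs between script mode and notebook cells, it can help to detect which one the code is running in before choosing how to parallelize. A rough heuristic sketch, standard library only; running_interactively is a hypothetical helper, and the check (an interactive __main__ usually has no __file__) is a common convention rather than a guarantee.

```python
import sys


def running_interactively():
    """Heuristic: True in a REPL or notebook, False when run as a script file."""
    main_module = sys.modules.get("__main__")
    return hasattr(sys, "ps1") or not hasattr(main_module, "__file__")


if __name__ == "__main__":
    if running_interactively():
        print("Interactive session: functions defined here may not be visible to workers.")
    else:
        print("Script mode.")
```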
@mmckerns I just opened a new issue https://github.com/uqfoundation/pathos/issues/219 , hope that multiprocess will solve all the problems in jupyterlab on windows. :)
Running this script on Windows 10 results in:
The same occurs with pathos.multiprocessing.ProcessPool. pathos.parallel.ParallelPool does not have this issue.