uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

pathos.multiprocessing._ProcessPool: Can't pickle <class 'ellipsis'> #218

Closed kennedyjosh closed 1 year ago

kennedyjosh commented 3 years ago

Code

from pathos.multiprocessing import _ProcessPool as Pool

...

def deploy_command(region, machine_ip, branch):
    ...

...

# args.region is a string, machines is a list of strings, branch is a string
call_args = zip([args.region] * len(machines), machines, [branch] * len(machines))
## print(list(call_args)) -> no ellipses visible in here
with Pool(processes=10) as pool:
    pool.starmap(deploy_command, call_args)     # fails here
    pool.close()
    pool.join()

Error & Stacktrace

Traceback (most recent call last):
  File "/Users/josh/Code/server/hapy/h.py", line 37, in <module>
    main()
  File "/Users/josh/Code/server/hapy/h.py", line 33, in main
    args.func(args, parser)
  File "/Users/josh/Code/server/hapy/commands/deploy.py", line 64, in deploy
    pool.starmap(deploy_command, call_args)
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/multiprocess/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/multiprocess/pool.py", line 771, in get
    raise self._value
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/multiprocess/pool.py", line 537, in _handle_tasks
    put(task)
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/multiprocess/connection.py", line 214, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/multiprocess/reduction.py", line 54, in dumps
    cls(buf, protocol, *args, **kwds).dump(obj)
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 498, in dump
    StockPickler.dump(self, obj)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 487, in dump
    self.save(obj)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 901, in save_tuple
    save(element)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 886, in save_tuple
    save(element)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 886, in save_tuple
    save(element)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 1493, in save_function
    pickler.save_reduce(_create_function, (obj.__code__,
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 692, in save_reduce
    save(args)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 901, in save_tuple
    save(element)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 1375, in save_module
    pickler.save_reduce(_import_module, (obj.__name__,), obj=obj,
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 717, in save_reduce
    save(state)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 1375, in save_module
    pickler.save_reduce(_import_module, (obj.__name__,), obj=obj,
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 717, in save_reduce
    save(state)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 1493, in save_function
    pickler.save_reduce(_create_function, (obj.__code__,
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 692, in save_reduce
    save(args)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 901, in save_tuple
    save(element)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/josh/Code/server/hapy/venv/lib/python3.9/site-packages/dill/_dill.py", line 1439, in save_type
    StockPickler.save_global(pickler, obj, name=name)
  File "/usr/local/Cellar/python@3.9/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 1070, in save_global
    raise PicklingError(
_pickle.PicklingError: Can't pickle <class 'ellipsis'>: it's not found as builtins.ellipsis
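
One way to read the traceback above: dill serializes a function by value (save_function), walks its globals (save_module_dict), follows references to other modules (save_module), and eventually reaches a global bound to type(Ellipsis), which save_global cannot look up by name. Assuming that reading is right, the sketch below searches a module, and the modules it references, for names bound to that type. The helper find_ellipsis_type and its max_depth parameter are hypothetical names for illustration; they are not part of pathos or dill.

```python
import types

ELLIPSIS_TYPE = type(Ellipsis)  # the object shown as <class 'ellipsis'>


def find_ellipsis_type(module, max_depth=3):
    """Return dotted paths of module globals bound to type(Ellipsis)."""
    hits = []
    seen = set()
    stack = [(module, module.__name__, 0)]
    while stack:
        mod, path, depth = stack.pop()
        if id(mod) in seen:
            continue  # modules often reference each other
        seen.add(id(mod))
        for name, value in list(vars(mod).items()):
            if value is ELLIPSIS_TYPE:
                hits.append(f"{path}.{name}")
            elif isinstance(value, types.ModuleType) and depth < max_depth:
                stack.append((value, f"{path}.{name}", depth + 1))
    return sorted(hits)


# Stand-in modules for illustration; in practice pass the real module object.
helper = types.ModuleType("helper")
helper.EllipsisType = type(Ellipsis)
commands = types.ModuleType("commands")
commands.helper = helper

print(find_ellipsis_type(commands))
```

Running this against the module that defines the pooled function may point at the import that drags the type into the pickled globals.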

Version Info

Python version: 3.9.6
pathos version: 0.2.8

mmckerns commented 3 years ago

Can you provide a minimal self-contained example? I tried a simple example, and I'm not seeing an issue yet.

Python 3.9.6 (default, Jul  3 2021, 08:33:50) 
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import pathos as pa
>>> pa.multiprocessing._ProcessPool
<class 'multiprocess.pool.Pool'>
>>> 
>>> p = pa.multiprocessing._ProcessPool()
>>> p.map(lambda x: Ellipsis, [Ellipsis]*4)
[Ellipsis, Ellipsis, Ellipsis, Ellipsis]
>>> pa.__version__
'0.2.9.dev0'
>>> 

Also, from your traceback, I see dill is punting to pickle, which is failing. dill should handle the Ellipsis itself, so that's a bit odd.

>>> import dill
>>> dill.__version__
'0.3.5.dev0'
>>> dill.dumps(Ellipsis)
b'\x80\x04\x95,\x00\x00\x00\x00\x00\x00\x00\x8c\ndill._dill\x94\x8c\n_eval_repr\x94\x93\x94\x8c\x08Ellipsis\x94\x85\x94R\x94.'
>>> dill.loads(_)
Ellipsis
>>> 
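
Note that the session above pickles the Ellipsis singleton, while the PicklingError names <class 'ellipsis'>, which is type(Ellipsis). The following is a minimal sketch, using only the standard library, of why a lookup by name fails for the type but not for the singleton. It does not use dill, so it says nothing about which dill versions handle the type.

```python
import builtins
import pickle

singleton = Ellipsis
ellipsis_type = type(Ellipsis)

# The singleton is reachable as builtins.Ellipsis, so it can be pickled by name.
singleton_by_name = getattr(builtins, "Ellipsis") is singleton
singleton_roundtrip = pickle.loads(pickle.dumps(singleton)) is singleton

# The type reports module "builtins" and name "ellipsis", but builtins has no
# attribute with that name, which is what "not found as builtins.ellipsis" means.
type_module = ellipsis_type.__module__
type_name = ellipsis_type.__qualname__
type_by_name = hasattr(builtins, type_name)

# The standard-library pickler special-cases this type instead of looking it up by name.
type_roundtrip = pickle.loads(pickle.dumps(ellipsis_type)) is ellipsis_type

print(type_module, type_name, type_by_name)
```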

kennedyjosh commented 3 years ago

I think I might know why it isn't working: the Python module containing this code is dynamically loaded by another script, like so:

def main():

    # ...

    all_files = os.listdir(srcpath)
    for filename in sorted(all_files):
        if filename.endswith(".py"):
            filepath = os.path.join(srcpath, filename)
            spec = importlib.util.spec_from_file_location(filename, filepath)
            mod = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(mod)
            # if a module doesn't have this function, script should crash
            mod.some_init_fun()

    # use argparse to parse command-line args and call the appropriate function
    args = parser.parse_args()
    args.func(args, parser)

I'm using argparse, and in each module some_init_fun initializes a subparser (one of these modules contains the problematic code from this issue). If I take the code I posted here and put it into its own file, it works fine. Here is a minimal example:

from pathos.multiprocessing import _ProcessPool as Pool

def deploy_command(region, machine_ip, branch):
    print(f"called as: {region}, {machine_ip}, {branch}")

def main():
    pool = Pool(processes=10)
    res = pool.starmap(deploy_command,
                 [["us-east-1", "prod2", "master"],
                 ["us-east-1", "prod3", "master"],
                 ["us-east-1", "prod4", "master"]]
                 )
    pool.close()
    pool.join()

if __name__ == "__main__":
    main()

Basically, when this code is called from its own file, it works. When it is called from another file as a dynamically loaded module, it doesn't work. Any ideas on how to solve this without putting everything in the same file?
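
A possible explanation, not confirmed in this thread: a module created with importlib.util.spec_from_file_location is not added to sys.modules unless the caller does so, which means a function defined in it cannot be pickled by reference (by module name and function name). The sketch below shows that difference with the standard-library pickle only. The file name deploy_demo.py and the module names deploy_demo_unregistered and deploy_demo_registered are hypothetical, and the sketch does not use pathos or dill, which may fall back to serializing the function by value instead.

```python
import importlib.util
import pickle
import sys
import tempfile
from pathlib import Path

SOURCE = (
    "def deploy_command(region, machine_ip, branch):\n"
    "    return (region, machine_ip, branch)\n"
)


def load_module(name, path, register):
    """Load a source file as a module, optionally registering it in sys.modules."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    if register:
        sys.modules[name] = module  # makes the module findable by name
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "deploy_demo.py"
    path.write_text(SOURCE)

    # Not registered: pickle cannot import the module to find the function.
    unregistered = load_module("deploy_demo_unregistered", path, register=False)
    try:
        pickle.dumps(unregistered.deploy_command)
        unregistered_ok = True
    except pickle.PicklingError:
        unregistered_ok = False

    # Registered: the function is found by module name and function name.
    registered = load_module("deploy_demo_registered", path, register=True)
    payload = pickle.dumps(registered.deploy_command)
    registered_ok = pickle.loads(payload) is registered.deploy_command

sys.modules.pop("deploy_demo_registered", None)  # clean up the demo module
print(unregistered_ok, registered_ok)
```

Registering the module only helps in the process that did the registration. Whether a pool worker can resolve the same name depends on how the worker is started, so this is a lead to investigate and not a fix.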

kennedyjosh commented 3 years ago

> Also, from your traceback, I see dill is punting to pickle, which is failing. dill should handle the Ellipsis itself, so that's a bit odd.

Yes, I noticed this. The whole reason I'm using pathos is that I hit a different pickling error with the built-in multiprocessing library. Strange that pathos solved that one but not this one.

mmckerns commented 3 years ago

Two things: (1) How are you dynamically loading the module from another file (code please)? (2) If you are executing anything from the shell (i.e. outside of Python, in another process), are you sure that the version of Python and the PYTHONPATH are the same as in main? For example, if you have multiple Python installations that depend on a shell script to set the path, then when you spawn a new process you need to make sure that you are using the same Python and the same path as in main.
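
One way to check point (2) is to ask a child process which interpreter it is running and compare that with the parent. This is a minimal sketch using only the standard library; the helper name child_interpreter_info is hypothetical, and it launches the child via sys.executable, so substitute whatever command the application actually spawns.

```python
import json
import subprocess
import sys

# One-liner run in the child: report interpreter path, version, and sys.path.
REPORT = (
    "import json, sys; "
    "print(json.dumps({'executable': sys.executable, "
    "'version': sys.version, 'path': sys.path}))"
)


def child_interpreter_info(python=sys.executable):
    """Run a child interpreter and return what it reports about itself."""
    result = subprocess.run(
        [python, "-c", REPORT], capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


parent = {"executable": sys.executable, "version": sys.version, "path": sys.path}
child = child_interpreter_info()

for key in ("executable", "version"):
    status = "same" if parent[key] == child[key] else "DIFFERENT"
    print(f"{key}: {status}")

# Entries the parent can import from that the child cannot see.
print("missing in child:", [p for p in parent["path"] if p not in child["path"]])
```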

kennedyjosh commented 3 years ago

I posted the code above, but I'll repost it here:

# dynamically load all files in srcpath and call their top-level some_init_fun method
all_files = os.listdir(srcpath)
for filename in sorted(all_files):
    if filename.endswith(".py"):
        filepath = os.path.join(srcpath, filename)
        spec = importlib.util.spec_from_file_location(filename, filepath)
        mod = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(mod)
        # if a module doesn't have this function, script should crash
        mod.some_init_fun()

The function some_init_fun initializes and populates an ArgumentParser. After this for loop, the parent ArgumentParser is called to parse the command line arguments and call the correct function.

# use argparse to parse command-line args and call the appropriate function
args = parser.parse_args()     # parses command-line args into variables
args.func(args, parser)     # calls function from some other file

So, for the code I posted in my previous comment, the user enters a command with options on the command line, and this gets passed down to the main function, which uses the Pool to call deploy_command.
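
The loader and dispatch described above can be sketched as a single self-contained script. Everything here is an assumed layout and not the actual project code: the plugin source, the deploy function, and passing the subparsers object into some_init_fun are all hypothetical, and the pool call is left out so that the structure runs on its own.

```python
import argparse
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical plugin file; the real modules are not shown in this thread.
PLUGIN_SOURCE = '''
def deploy(args, parser):
    return f"deploying {args.branch} to {args.region}"

def some_init_fun(subparsers):
    sub = subparsers.add_parser("deploy")
    sub.add_argument("--region", default="us-east-1")
    sub.add_argument("--branch", default="master")
    sub.set_defaults(func=deploy)
'''


def load_plugins(srcpath, subparsers):
    """Load every .py file in srcpath and let it register its subparser."""
    for path in sorted(Path(srcpath).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, str(path))
        mod = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(mod)
        mod.some_init_fun(subparsers)  # crashes if the function is missing


def run(argv):
    parser = argparse.ArgumentParser(prog="h.py")
    subparsers = parser.add_subparsers(dest="command", required=True)
    with tempfile.TemporaryDirectory() as srcpath:
        Path(srcpath, "deploy.py").write_text(PLUGIN_SOURCE)
        load_plugins(srcpath, subparsers)
    args = parser.parse_args(argv)
    return args.func(args, parser)  # dispatch to the plugin's function


result = run(["deploy", "--branch", "main"])
print(result)
```

Replacing the body of the hypothetical deploy function with the Pool call from the first comment would turn this into a candidate reproduction script.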

===

> are you sure that the version of python and the pythonpath is the same as in main?

Good question. I inserted print(sys.version) and print(pathos.__version__) right before the point where the error occurs and verified that the versions match what I posted in my first comment (Python 3.9.6, pathos 0.2.8).

mmckerns commented 3 years ago

Dynamically loaded modules are not guaranteed to work. What I meant by "code please" is that I can't just run your example and see whether it errors or not... I'd first have to write the module you're dynamically loading, right? Where is some_init_fun defined in your example? If it's not important, then maybe remove it.

mmckerns commented 1 year ago

I'm not able to repeat the errors you are seeing, and this issue appears a bit stale. I'm going to close this, but please feel free to reopen this if you have more input.