uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

_pickle.PicklingError #97

Closed mhaseebtariq closed 7 years ago

mhaseebtariq commented 7 years ago

I am getting a PicklingError with a specific implementation. The error can be reproduced as follows:

import pathos

class Squared:
    def calculate(self, number):
        return number**2

def get_squared_class(base_class):
    class NewSquared(base_class):
        pass
    return NewSquared

class Main:
    def __init__(self, squared):
        self.squared = squared

    def call_squared(self, number):
        return self.squared.calculate(number)

    def process(self):
        return multiprocessor(self.call_squared, [2, 3])

def multiprocessor(worker, data):

    results = pathos.multiprocessing.ProcessingPool(ncpus=2).map(worker, data)

    print(results)

sq = get_squared_class(Squared)

m = Main(sq)
m.process()

There is a reason the code is structured this way. I have tried many workarounds, but none of them meets the requirements of my use case.

mmckerns commented 7 years ago

Can you check your code? I'm getting a TypeError: unbound method calculate() must be called with NewSquared instance as first argument (got int instance instead) when I try it. Also, let me know which version of python you are using. Please also post your traceback.

mhaseebtariq commented 7 years ago

Hi @mmckerns, thanks for the reply. To answer your questions:

  1. I am using Python 3.5 (Anaconda)
  2. The code didn't reach that point - therefore, I didn't receive the unbound method error. I have updated the code in the description
  3. Here's the complete traceback -
python bug.py 
Traceback (most recent call last):
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 907, in save_global
    obj2, parent = _getattribute(module, name)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 265, in _getattribute
    .format(name, obj))
AttributeError: Can't get local attribute 'get_squared_class.<locals>.NewSquared' on <function get_squared_class at 0x1003bcf28>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bug.py", line 35, in <module>
    m.process()
  File "bug.py", line 22, in process
    return multiprocessor(self.call_squared, [2, 3])
  File "bug.py", line 27, in multiprocessor
    results = pathos.multiprocessing.ProcessingPool(ncpus=2).map(worker, data)
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/pathos/multiprocessing.py", line 136, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/multiprocess/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/multiprocess/pool.py", line 608, in get
    raise self._value
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/multiprocess/pool.py", line 385, in _handle_tasks
    put(task)
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/multiprocess/connection.py", line 206, in send
    self._send_bytes(ForkingPickler.dumps(obj))
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/multiprocess/reduction.py", line 53, in dumps
    cls(buf, protocol).dump(obj)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 408, in dump
    self.save(obj)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 740, in save_tuple
    save(element)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 725, in save_tuple
    save(element)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 725, in save_tuple
    save(element)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/dill/dill.py", line 793, in save_function
    obj.__dict__), obj=obj)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 599, in save_reduce
    save(args)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 740, in save_tuple
    save(element)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 725, in save_tuple
    save(element)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/dill/dill.py", line 1039, in save_cell
    pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 599, in save_reduce
    save(args)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 725, in save_tuple
    save(element)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/dill/dill.py", line 1001, in save_instancemethod0
    pickler.save_reduce(MethodType, (obj.__func__, obj.__self__), obj=obj)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 599, in save_reduce
    save(args)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 725, in save_tuple
    save(element)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 623, in save_reduce
    save(state)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/dill/dill.py", line 835, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 810, in save_dict
    self._batch_setitems(obj.items())
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 841, in _batch_setitems
    save(v)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/haseebtariq/anaconda/lib/python3.5/site-packages/dill/dill.py", line 1203, in save_type
    StockPickler.save_global(pickler, obj)
  File "/Users/haseebtariq/anaconda/lib/python3.5/pickle.py", line 911, in save_global
    (obj, module_name, name))
_pickle.PicklingError: Can't pickle <class '__main__.get_squared_class.<locals>.NewSquared'>: it's not found as __main__.get_squared_class.<locals>.NewSquared

mmckerns commented 7 years ago

I made two slight adjustments... add import dill; dill.detect.trace(True) and use ncpus=1. I can reproduce your error in python 3.5. The first time through I was using 2.7. I'm also going to slightly adjust the Pool you are using.

import pathos
import dill

class Squared(object):
  def calculate(number):
    return number**2

def get_squared_class(base_class):
  class NewSquared(base_class):
    pass
  return NewSquared

class Main(object):
  def __init__(self, squared):
    self.squared = squared
  def call_squared(self, number):
    return self.squared.calculate(number)
  def process(self):
    return multiprocessor(self.call_squared, [2,3])

def multiprocessor(worker, data):
  results = pathos.pools.ProcessPool(1).map(worker, data)
  #results = pathos.pools.ParallelPool(1).map(worker, data)
  #results = pathos.pools.ThreadPool(1).map(worker, data)
  print("RESULTS: %s" % results)

sq = get_squared_class(Squared)

if __name__ == '__main__':
  #dill.detect.trace(True)
  #print(dill.copy(sq))
  m = Main(sq)
  #print(dill.copy(m.call_squared))
  m.process()

Here are the results from 2.7:

$ python test_pathos.py 
F2: <function mapstar at 0x107bbbc08>
# F2
F1: <function <lambda> at 0x107c8bb90>
F2: <function _create_function at 0x107b698c0>
# F2
Co: <code object <lambda> at 0x1076ce830, file "/Users/mmckerns/lib/python2.7/site-packages/pathos-0.2.1.dev0-py2.7.egg/pathos/helpers/mp_helper.py", line 15>
T1: <type 'code'>
F2: <function _load_type at 0x107b697d0>
# F2
# T1
# Co
D4: <dict object at 0x107c147f8>
# D4
Ce: <cell at 0x107c70ad0: instancemethod object at 0x107b7d1e0>
F2: <function _create_cell at 0x107b69b90>
# F2
Me: <bound method Main.call_squared of <__main__.Main object at 0x107c2ee90>>
T1: <type 'instancemethod'>
# T1
F1: <function call_squared at 0x107c5f6e0>
Co: <code object call_squared at 0x107612db0, file "test_pathos.py", line 16>
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c955c8>
# D2
# F1
T2: <class '__main__.Main'>
F2: <function _create_type at 0x107b69848>
# F2
T1: <type 'type'>
# T1
T1: <type 'object'>
# T1
D2: <dict object at 0x107c95910>
F1: <function process at 0x107c5f758>
Co: <code object process at 0x1076120b0, file "test_pathos.py", line 18>
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c95b40>
# D2
# F1
F1: <function __init__ at 0x107c5f668>
Co: <code object __init__ at 0x107615cb0, file "test_pathos.py", line 14>
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c95c58>
# D2
# F1
# D2
# T2
D2: <dict object at 0x107c28e88>
T2: <class '__main__.NewSquared'>
T2: <class '__main__.Squared'>
D2: <dict object at 0x107c95e88>
F1: <function calculate at 0x107c5f578>
Co: <code object calculate at 0x107467eb0, file "test_pathos.py", line 5>
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c9f050>
# D2
# F1
# D2
# T2
D2: <dict object at 0x107c95d70>
# D2
# T2
# D2
# Me
# Ce
D2: <dict object at 0x107c954b0>
# D2
# F1
D2: <dict object at 0x107c90e88>
# D2
F2: <function mapstar at 0x107bbbc08>
# F2
F1: <function <lambda> at 0x107c8bb90>
F2: <function _create_function at 0x107b698c0>
# F2
Co: <code object <lambda> at 0x1076ce830, file "/Users/mmckerns/lib/python2.7/site-packages/pathos-0.2.1.dev0-py2.7.egg/pathos/helpers/mp_helper.py", line 15>
T1: <type 'code'>
F2: <function _load_type at 0x107b697d0>
# F2
# T1
# Co
D4: <dict object at 0x107c147f8>
# D4
Ce: <cell at 0x107c70ad0: instancemethod object at 0x107b7d1e0>
F2: <function _create_cell at 0x107b69b90>
# F2
Me: <bound method Main.call_squared of <__main__.Main object at 0x107c2ee90>>
T1: <type 'instancemethod'>
# T1
F1: <function call_squared at 0x107c5f6e0>
Co: <code object call_squared at 0x107612db0, file "test_pathos.py", line 16>
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c955c8>
# D2
# F1
T2: <class '__main__.Main'>
F2: <function _create_type at 0x107b69848>
# F2
T1: <type 'type'>
# T1
T1: <type 'object'>
# T1
D2: <dict object at 0x107c95d70>
F1: <function process at 0x107c5f758>
T4: <type 'exceptions.TypeError'>
Co: <code object process at 0x1076120b0, file "test_pathos.py", line 18>
# T4
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c95b40>
# D2
# F1
F1: <function __init__ at 0x107c5f668>
Co: <code object __init__ at 0x107615cb0, file "test_pathos.py", line 14>
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c95c58>
# D2
# F1
# D2
# T2
D2: <dict object at 0x107c28e88>
T2: <class '__main__.NewSquared'>
T2: <class '__main__.Squared'>
D2: <dict object at 0x107c957f8>
F1: <function calculate at 0x107c5f578>
Co: <code object calculate at 0x107467eb0, file "test_pathos.py", line 5>
# Co
D1: <dict object at 0x10738c168>
# D1
D2: <dict object at 0x107c9f050>
# D2
# F1
# D2
# T2
D2: <dict object at 0x107c95398>
# D2
# T2
# D2
# Me
# Ce
D2: <dict object at 0x107c954b0>
# D2
# F1
D2: <dict object at 0x107c95050>
# D2
T4: <type 'exceptions.TypeError'>
# T4
Traceback (most recent call last):
  File "test_pathos.py", line 30, in <module>
    m.process()
  File "test_pathos.py", line 19, in process
    return multiprocessor(self.call_squared, [2,3])
  File "test_pathos.py", line 22, in multiprocessor
    results = pathos.multiprocessing.ProcessingPool(ncpus=1).map(worker, data)
  File "/Users/mmckerns/lib/python2.7/site-packages/pathos-0.2.1.dev0-py2.7.egg/pathos/multiprocessing.py", line 137, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/Users/mmckerns/lib/python2.7/site-packages/multiprocess-0.70.5.dev0-py2.7-macosx-10.12-x86_64.egg/multiprocess/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/Users/mmckerns/lib/python2.7/site-packages/multiprocess-0.70.5.dev0-py2.7-macosx-10.12-x86_64.egg/multiprocess/pool.py", line 567, in get
    raise self._value
TypeError: unbound method calculate() must be called with NewSquared instance as first argument (got int instance instead)

This tells me you have a bug in your code somewhere... you are returning an Exception... it looks like something with the __init__ of NewSquared is the issue...?

However, I can reproduce your error in 3.5:

$ python test_pathos.py 
F2: <function mapstar at 0x104633488>
# F2
F1: <function starargs.<locals>.<lambda> at 0x104b5e9d8>
F2: <function _create_function at 0x104ab1268>
# F2
Co: <code object <lambda> at 0x104609d20, file "/Users/mmckerns/lib/python3.5/site-packages/pathos-0.2.1.dev0-py3.5.egg/pathos/helpers/mp_helper.py", line 15>
T1: <class 'code'>
F2: <function _load_type at 0x104ab1158>
# F2
# T1
# Co
D4: <dict object at 0x104610a88>
# D4
Ce: <cell at 0x1044f4b88: method object at 0x1042f6608>
F2: <function _create_cell at 0x104ab1598>
# F2
Me: <bound method Main.call_squared of <__main__.Main object at 0x10457a7f0>>
T1: <class 'method'>
# T1
F1: <function Main.call_squared at 0x104c26ae8>
Co: <code object call_squared at 0x1045600c0, file "test_pathos.py", line 16>
# Co
D3: <dict object at 0x104330048>
# D3
D2: <dict object at 0x104bb5888>
# D2
# F1
T5: <class '__main__.Main'>
# T5
D2: <dict object at 0x104392348>
T5: <class '__main__.get_squared_class.<locals>.NewSquared'>
F2: <function mapstar at 0x104633488>
# F2
F1: <function starargs.<locals>.<lambda> at 0x104b5e9d8>
F2: <function _create_function at 0x104ab1268>
# F2
Co: <code object <lambda> at 0x104609d20, file "/Users/mmckerns/lib/python3.5/site-packages/pathos-0.2.1.dev0-py3.5.egg/pathos/helpers/mp_helper.py", line 15>
T1: <class 'code'>
F2: <function _load_type at 0x104ab1158>
# F2
# T1
# Co
D4: <dict object at 0x104610a88>
# D4
Ce: <cell at 0x1044f4b88: method object at 0x1042f6608>
F2: <function _create_cell at 0x104ab1598>
# F2
Me: <bound method Main.call_squared of <__main__.Main object at 0x10457a7f0>>
T1: <class 'method'>
# T1
F1: <function Main.call_squared at 0x104c26ae8>
Co: <code object call_squared at 0x1045600c0, file "test_pathos.py", line 16>
# Co
D3: <dict object at 0x104330048>
# D3
D2: <dict object at 0x104bb5888>
# D2
# F1
T5: <class '__main__.Main'>
# T5
D2: <dict object at 0x104392348>
T5: <class '__main__.get_squared_class.<locals>.NewSquared'>
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 911, in save_global
    obj2, parent = _getattribute(module, name)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 265, in _getattribute
    .format(name, obj))
AttributeError: Can't get local attribute 'get_squared_class.<locals>.NewSquared' on <function get_squared_class at 0x10432f0d0>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test_pathos.py", line 30, in <module>
    m.process()
  File "test_pathos.py", line 19, in process
    return multiprocessor(self.call_squared, [2,3])
  File "test_pathos.py", line 22, in multiprocessor
    results = pathos.multiprocessing.ProcessingPool(ncpus=1).map(worker, data)
  File "/Users/mmckerns/lib/python3.5/site-packages/pathos-0.2.1.dev0-py3.5.egg/pathos/multiprocessing.py", line 137, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/Users/mmckerns/lib/python3.5/site-packages/multiprocess-0.70.5.dev0-py3.5-macosx-10.12-x86_64.egg/multiprocess/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Users/mmckerns/lib/python3.5/site-packages/multiprocess-0.70.5.dev0-py3.5-macosx-10.12-x86_64.egg/multiprocess/pool.py", line 608, in get
    raise self._value
  File "/Users/mmckerns/lib/python3.5/site-packages/multiprocess-0.70.5.dev0-py3.5-macosx-10.12-x86_64.egg/multiprocess/pool.py", line 385, in _handle_tasks
    put(task)
  File "/Users/mmckerns/lib/python3.5/site-packages/multiprocess-0.70.5.dev0-py3.5-macosx-10.12-x86_64.egg/multiprocess/connection.py", line 209, in send
    self._send_bytes(ForkingPickler.dumps(obj))
  File "/Users/mmckerns/lib/python3.5/site-packages/multiprocess-0.70.5.dev0-py3.5-macosx-10.12-x86_64.egg/multiprocess/reduction.py", line 53, in dumps
    cls(buf, protocol).dump(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 408, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 744, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 729, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 729, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.5/site-packages/dill-0.2.6.dev0-py3.5.egg/dill/dill.py", line 1306, in save_function
    obj.__dict__), obj=obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 603, in save_reduce
    save(args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 744, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 729, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.5/site-packages/dill-0.2.6.dev0-py3.5.egg/dill/dill.py", line 1057, in save_cell
    pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 603, in save_reduce
    save(args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 729, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.5/site-packages/dill-0.2.6.dev0-py3.5.egg/dill/dill.py", line 1007, in save_instancemethod0
    pickler.save_reduce(MethodType, (obj.__func__, obj.__self__), obj=obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 603, in save_reduce
    save(args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 729, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 520, in save
    self.save_reduce(obj=obj, *rv)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 627, in save_reduce
    save(state)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.5/site-packages/dill-0.2.6.dev0-py3.5.egg/dill/dill.py", line 841, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 814, in save_dict
    self._batch_setitems(obj.items())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 845, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 475, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.5/site-packages/dill-0.2.6.dev0-py3.5.egg/dill/dill.py", line 1230, in save_type
    StockPickler.save_global(pickler, obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pickle.py", line 915, in save_global
    (obj, module_name, name))
_pickle.PicklingError: Can't pickle <class '__main__.get_squared_class.<locals>.NewSquared'>: it's not found as __main__.get_squared_class.<locals>.NewSquared

It looks like the NewSquared object is causing the pickling problem. The error is basically saying that the copy of the class that is being produced is not the same as the class it's looking for. This is a dill pickling issue that I think is solved, as you can see if you do some uncommenting and just test dill.copy(sq). It works, so the object can be pickled. There's probably some leftover dill.settings logic in pathos.multiprocessing that needs some cleaning. I'll have to track that down. Hmm.
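As an illustration of the failure mode, here is a minimal sketch that uses only the standard-library pickle module (not dill or pathos). The last frames of the traceback above are in the stock pickler's save_global, which stores a class by reference to its qualified name; a class created inside a function has "<locals>" in that name, so the lookup fails. The helper can_pickle is a hypothetical name introduced for this sketch.

```python
import pickle


class Squared:
    def calculate(self, number):
        return number ** 2


def get_squared_class(base_class):
    # A class created inside a function gets "<locals>" in its __qualname__.
    class NewSquared(base_class):
        pass
    return NewSquared


def can_pickle(obj):
    """Hypothetical helper: True if the standard-library pickler accepts obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError):
        # Depending on the Python version, either exception type may be raised.
        return False
    return True


local_cls = get_squared_class(Squared)
print(local_cls.__qualname__)  # get_squared_class.<locals>.NewSquared
print(can_pickle(local_cls))   # False: the class cannot be found by name
```

dill can serialize such classes by value in many cases, which is consistent with dill.copy(sq) succeeding; the sketch only shows what happens once pickling falls through to the stock by-reference path.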

I've provided some alternates, using different pools, which use different backends (and thus different pickling)... and both give the expected answer. So you could use one of those instead, at least for the time being.

I think this also might push me to expose the dill.settings API in pathos, to better utilize some of the serialization variants. It's a pretty common pickling error that one can work around either by changing dill.settings (which needs work to be exposed in pathos) or by using one of the alternate pathos.pools.

Is that a sufficient workaround for you? If not, I think I'd need to see where pathos might need a bug-fix (i.e. it'll take some time, and I might not do it immediately). I think this is actually a bug, but let me know if the workarounds are sufficient. If so, you can go forward while I track down the bug.

mhaseebtariq commented 7 years ago

Thanks! This solved the error on the example file (bug.py). But now when I try these other pooling options on my actual code, I can only get it to work with ThreadPool. But with thread pools the multiple cores are not utilized, and my process is really processor intensive. When I try ProcessPool or ParallelPool, I start getting all kinds of different errors: MaximumRecursion, _pickle.PicklingError. I will try to replicate the same error for the example I shared in this issue and then get back to you.

mhaseebtariq commented 7 years ago

Hi @mmckerns, I have one small unrelated question, if you have time. Sometimes, if there's too much data to be passed between the workers, the threads just hang forever. I think it is related to this issue: http://stackoverflow.com/questions/21641887/python-multiprocessing-process-hangs-on-join-for-large-queue

Would you happen to know any solution/workaround for this?

mmckerns commented 7 years ago

I'd have to agree with the answers provided in the link you posted. One of the reasons I made pathos was the ability to do hierarchical parallel maps, and that can probably avoid the deadlock issue (from overloaded queues) if you can break your data into chunks and then process those chunks in parallel. Typically, that means a ThreadPool and a ProcessPool working together.
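A rough sketch of the chunking idea, under stated assumptions: it uses the standard library's ThreadPoolExecutor as a stand-in for the outer pool, and the names chunked, square_chunk, and chunked_parallel_map are hypothetical. In a pathos setup, the per-chunk work would presumably be handed to a ProcessPool rather than computed inline as it is here.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def square_chunk(chunk):
    # Stand-in for the per-chunk work; this is where an inner pool could go.
    return [n ** 2 for n in chunk]


def chunked_parallel_map(data, chunk_size=3, workers=2):
    """Map over chunks in parallel, then flatten the per-chunk results."""
    chunks = chunked(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as outer:
        # map() preserves chunk order, so the flattened result is ordered too.
        per_chunk = list(outer.map(square_chunk, chunks))
    return [value for chunk in per_chunk for value in chunk]


print(chunked_parallel_map([1, 2, 3, 4, 5, 6, 7]))
```

Each task then carries one small chunk through the queue instead of the whole dataset, which is the property the suggestion above relies on.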

mhaseebtariq commented 7 years ago

Cool. Thank you very much for the suggestion! :)

mmckerns commented 7 years ago

Let me know if you are past this issue, and it can be closed.

mhaseebtariq commented 7 years ago

I was not able to restructure the example code (mentioned in the first comment) to reproduce the bug - after implementing the workaround you suggested. For my actual code - I restructured it in a way that I don't have to create a new class definition inside a function. So now I don't have this issue any more. I guess this issue can be closed then. Thank you so much for your time! :)
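For reference, a minimal sketch of that kind of restructuring, checked here with the standard-library pickle module only. It assumes the subclass can be defined at module level, and it passes an instance to Main so that calculate receives self; the actual restructured code is not shown in this thread.

```python
import pickle


class Squared:
    def calculate(self, number):
        return number ** 2


# Defined at module level, so it can be found again by its qualified name.
class NewSquared(Squared):
    pass


class Main:
    def __init__(self, squared):
        self.squared = squared

    def call_squared(self, number):
        return self.squared.calculate(number)


m = Main(NewSquared())

# Round-trip the bound method, roughly what a process pool has to do
# before it can hand the worker to another process.
restored = pickle.loads(pickle.dumps(m.call_squared))
print(restored(3))  # 9
```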

mmckerns commented 7 years ago

Ok, thanks. I've opened a new ticket to enable dill.settings to be exposed in pathos, based on the discussion in this issue (see: #99).