uqfoundation / pathos

parallel graph management and execution in heterogeneous computing
http://pathos.rtfd.io

Cannot pickle error #224

Closed: mikygit closed this issue 2 years ago

mikygit commented 2 years ago

Hello, I'm getting the pickle error below when trying to run the following code. NB: I'm using Python 3.9, and import _multiprocess works fine.

Any ideas? Thx.

from pathos.multiprocessing import ProcessingPool as Pool
from envlogger import reader

def process_line(e):
    return len(e)

if __name__ == "__main__":
    pool = Pool(1)
    with reader.Reader(data_directory="/xxx/project/vdt/tests") as r:
        results = pool.map(process_line, r.episodes[:1])

    print(results)

Traceback (most recent call last):
  File "/home/toto/project/./tests/multiproc_tests", line 28, in <module>
    results = pool.map(process_line, r.episodes[:1])
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/pathos/multiprocessing.py", line 139, in map
    return _pool.map(star(f), zip(*args)) # chunksize
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/multiprocess/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/multiprocess/pool.py", line 771, in get
    raise self._value
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/multiprocess/pool.py", line 537, in _handle_tasks
    put(task)
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/multiprocess/connection.py", line 214, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/multiprocess/reduction.py", line 54, in dumps
    cls(buf, protocol, *args, **kwds).dump(obj)
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/dill/_dill.py", line 498, in dump
    StockPickler.dump(self, obj)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 485, in dump
    self.save(obj)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 899, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 884, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 884, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 884, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 884, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 601, in save
    self.save_reduce(obj=obj, *rv)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 715, in save_reduce
    save(state)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/dill/_dill.py", line 990, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 969, in save_dict
    self._batch_setitems(obj.items())
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 995, in _batch_setitems
    save(v)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/dill/_dill.py", line 1493, in save_function
    pickler.save_reduce(_create_function, (obj.__code__,
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 690, in save_reduce
    save(args)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 899, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 884, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/dill/_dill.py", line 1227, in save_cell
    pickler.save_reduce(_create_cell, (f,), obj=obj)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 690, in save_reduce
    save(args)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 558, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 884, in save_tuple
    save(element)
  File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 576, in save
    rv = reduce(self.proto)
TypeError: cannot pickle 'envlogger.backends.python.episode_info.EpisodeInfo' object

packages: _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 1_gnu conda-forge absl-py 1.0.0 pypi_0 pypi apache-beam 2.34.0 pypi_0 pypi astunparse 1.6.3 pypi_0 pypi attrs 21.2.0 pypi_0 pypi avro-python3 1.9.2.1 pypi_0 pypi bzip2 1.0.8 h7f98852_4 conda-forge ca-certificates 2021.10.8 ha878542_0 conda-forge cachetools 4.2.4 pypi_0 pypi certifi 2021.10.8 pypi_0 pypi cffi 1.15.0 pypi_0 pypi charset-normalizer 2.0.9 pypi_0 pypi click 8.0.3 pypi_0 pypi cloudpickle 2.0.0 pypi_0 pypi colorama 0.4.4 pypi_0 pypi crcmod 1.7 pypi_0 pypi cycler 0.11.0 pypi_0 pypi cython 0.29.25 pypi_0 pypi d3rlpy 0.91 pypi_0 pypi deprecated 1.2.13 pypi_0 pypi dill 0.3.4 pypi_0 pypi dm-env 1.5 pypi_0 pypi dm-reverb-nightly 0.7.0.dev20211211 pypi_0 pypi dm-tree 0.1.6 pypi_0 pypi docopt 0.6.2 pypi_0 pypi envlogger 1.0.5 pypi_0 pypi fastavro 1.4.7 pypi_0 pypi fasteners 0.15 pypi_0 pypi filelock 3.4.0 pypi_0 pypi flatbuffers 2.0 pypi_0 pypi fonttools 4.28.3 pypi_0 pypi free-mujoco-py 2.1.6 pypi_0 pypi future 0.18.2 pypi_0 pypi gast 0.4.0 pypi_0 pypi glfw 1.12.0 pypi_0 pypi google-auth 2.3.3 pypi_0 pypi google-auth-oauthlib 0.4.6 pypi_0 pypi google-pasta 0.2.0 pypi_0 pypi gputil 1.4.0 pypi_0 pypi grpcio 1.42.0 pypi_0 pypi gym 0.21.0 pypi_0 pypi h5py 3.6.0 pypi_0 pypi hdfs 2.6.0 pypi_0 pypi httplib2 0.19.1 pypi_0 pypi idna 3.3 pypi_0 pypi imageio 2.13.3 pypi_0 pypi importlib-metadata 4.8.2 pypi_0 pypi joblib 1.1.0 pypi_0 pypi jsonschema 4.2.1 pypi_0 pypi keras-nightly 2.8.0.dev2021121108 pypi_0 pypi keras-preprocessing 1.1.2 pypi_0 pypi kiwisolver 1.3.2 pypi_0 pypi ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge libblas 3.9.0 12_linux64_openblas conda-forge libcblas 3.9.0 12_linux64_openblas conda-forge libclang 12.0.0 pypi_0 pypi libffi 3.3 h58526e2_2 conda-forge libgcc-ng 11.2.0 h1d223b6_11 conda-forge libgfortran-ng 11.2.0 h69a702a_11 conda-forge libgfortran5 11.2.0 h5c6108e_11 conda-forge libgomp 11.2.0 h1d223b6_11 conda-forge liblapack 3.9.0 12_linux64_openblas conda-forge libnsl 2.0.0 h7f98852_0 conda-forge libopenblas 0.3.18 pthreads_h8fe5266_0 conda-forge libstdcxx-ng 11.2.0 he4da1e4_11 conda-forge libuuid 2.32.1 h7f98852_1000 conda-forge libzlib 1.2.11 h36c2ea0_1013 conda-forge markdown 3.3.6 pypi_0 pypi matplotlib 3.5.1 pypi_0 pypi mock 4.0.3 pypi_0 pypi monotonic 1.6 pypi_0 pypi msgpack 1.0.3 pypi_0 pypi multiprocess 0.70.12.2 pypi_0 pypi ncurses 6.2 h58526e2_4 conda-forge numpy 1.20.3 pypi_0 pypi oauth2client 4.1.3 pypi_0 pypi oauthlib 3.1.1 pypi_0 pypi openssl 1.1.1l h7f98852_0 conda-forge opt-einsum 3.3.0 pypi_0 pypi orjson 3.6.5 pypi_0 pypi packaging 21.3 pypi_0 pypi pathos 0.2.8 pypi_0 pypi pillow 8.4.0 pypi_0 pypi pip 21.3.1 pyhd8ed1ab_0 conda-forge portpicker 1.5.0 pypi_0 pypi pox 0.3.0 pypi_0 pypi ppft 1.6.6.4 pypi_0 pypi protobuf 3.19.1 pypi_0 pypi psutil 5.8.0 pypi_0 pypi pyarrow 5.0.0 pypi_0 pypi pyasn1 0.4.8 pypi_0 pypi pyasn1-modules 0.2.8 pypi_0 pypi pycparser 2.21 pypi_0 pypi pydot 1.4.2 pypi_0 pypi pymongo 3.12.3 pypi_0 pypi pyparsing 2.4.7 pypi_0 pypi pyrsistent 0.18.0 pypi_0 pypi python 3.9.0 hffdb5ce_5_cpython conda-forge python-dateutil 2.8.2 pypi_0 pypi python-interface 1.6.1 pypi_0 pypi python_abi 3.9 2_cp39 conda-forge pytz 2021.3 pypi_0 pypi pyyaml 6.0 pypi_0 pypi ray 1.9.0 pypi_0 pypi readline 8.1 h46c0cb4_0 conda-forge redis 4.0.2 pypi_0 pypi requests 2.26.0 pypi_0 pypi requests-oauthlib 1.3.0 pypi_0 pypi rsa 4.8 pypi_0 pypi scikit-learn 1.0.1 pypi_0 pypi scipy 1.7.3 pypi_0 pypi setuptools 59.6.0 py39hf3d152e_0 conda-forge six 1.16.0 pypi_0 pypi sqlite 3.37.0 h9cd32fc_0 conda-forge 
structlog 21.4.0 pypi_0 pypi tb-nightly 2.8.0a20211209 pypi_0 pypi tensorboard-data-server 0.6.1 pypi_0 pypi tensorboard-plugin-wit 1.8.0 pypi_0 pypi tensorboardx 2.4.1 pypi_0 pypi tensorflow-io-gcs-filesystem 0.22.0 pypi_0 pypi termcolor 1.1.0 pypi_0 pypi tf-estimator-nightly 2.8.0.dev2021121109 pypi_0 pypi tf-nightly-gpu 2.8.0.dev20211211 pypi_0 pypi threadpoolctl 3.0.0 pypi_0 pypi tk 8.6.11 h27826a3_1 conda-forge torch 1.10.0 pypi_0 pypi tqdm 4.62.3 pypi_0 pypi typing-extensions 3.10.0.2 pypi_0 pypi tzdata 2021e he74cb21_0 conda-forge urllib3 1.26.7 pypi_0 pypi werkzeug 2.0.2 pypi_0 pypi wheel 0.37.0 pyhd8ed1ab_1 conda-forge wrapt 1.13.3 pypi_0 pypi xz 5.2.5 h516909a_1 conda-forge zipp 3.6.0 pypi_0 pypi zlib 1.2.11 h36c2ea0_1013 conda-forge

mmckerns commented 2 years ago

It seems to go very quickly from multiprocess to pickle:

self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/multiprocess/reduction.py", line 54, in dumps
cls(buf, protocol, *args, **kwds).dump(obj)
File "/home/anaconda3/envs/vdt/lib/python3.9/site-packages/dill/_dill.py", line 498, in dump
StockPickler.dump(self, obj)
File "/home/anaconda3/envs/vdt/lib/python3.9/pickle.py", line 485, in dump
self.save(obj)

and then hit an unpicklable object:

TypeError: cannot pickle 'envlogger.backends.python.episode_info.EpisodeInfo' object

This looks like it's an issue of the object not being serializable. Can you try the following:

import dill
dill.dumps(process_line)
dill.dumps(r.episodes[0])

and if that fails, then do this:

import dill
dill.settings['recurse'] = True
dill.dumps(process_line)
dill.dumps(r.episodes[0])

If both of the above fail, then the only thing you can do is figure out a serializable equivalent for what you want to send across the map.
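
(Editor's note: one hedged illustration of that last suggestion. The idea is to pre-convert each episode to plain Python data in the parent process, so nothing unpicklable ever reaches the pool. The conversion list(e) and the directory path are assumptions borrowed from the original post; substitute whatever plain representation your processing actually needs.)

# Sketch only: hand the pool plain, picklable data instead of episode objects.
from pathos.multiprocessing import ProcessingPool as Pool
from envlogger import reader

def process_payload(payload):
    return len(payload)

if __name__ == "__main__":
    pool = Pool(1)
    with reader.Reader(data_directory="/xxx/project/vdt/tests") as r:
        # extract plain data in the parent; list(e) is a placeholder
        payloads = [list(e) for e in r.episodes[:1]]
        results = pool.map(process_payload, payloads)
    print(results)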

mikygit commented 2 years ago

Thanx for your answer. Yes, it still fails :-( Could you elaborate on "the only thing you can do is figure out a serializable equivalent"? Would this mean getting access to the code of the objects that are serialized, and possibly modifying the C++ code somehow?

mmckerns commented 2 years ago

What you will likely have to do is create a Python class that is derived from an episode, but includes a __reduce__ method that tells pickle how to serialize the state. The other option is to use a ThreadPool (using threading instead of multiprocessing).
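
(Editor's note: a minimal sketch of the first option, shown as a wrapper rather than a subclass since subclassing the C++-backed type may not be straightforward. The class name, the from_episode helper, and the steps field are all hypothetical; adapt them to whatever state your processing actually needs.)

class PicklableEpisode:
    """Hypothetical stand-in that carries only plain, picklable state."""

    def __init__(self, steps):
        self.steps = steps

    @classmethod
    def from_episode(cls, episode):
        # extract plain data from the real envlogger episode in the parent process
        return cls(steps=list(episode))

    def __reduce__(self):
        # tells pickle (and therefore dill/multiprocess) how to rebuild this
        # object on the worker side: call the class with the plain state
        return (self.__class__, (self.steps,))

def process_line(e):
    return len(e.steps)

The map would then run over [PicklableEpisode.from_episode(e) for e in r.episodes[:1]] instead of the raw episodes.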

mikygit commented 2 years ago

Right. Threading is not an option since it does not help in terms of speed in my case. I'll go for the other option then, thanx!

mmckerns commented 2 years ago

I should mention that if you are opposed to creating a class, and want to use the episode object directly, you'll need to create a function that is equivalent to the __reduce__ method... and register it to the dispatch_table for an episode object. Relevant examples are here: https://docs.python.org/3/library/pickle.html#pickle-dispatch

mikygit commented 2 years ago

Actually I don't quite see how to do it on an Iterator object.

mmckerns commented 2 years ago

Try registering a reduction function with copyreg.
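
(Editor's note: a hedged sketch of what such a registration could look like. The import path is taken from the TypeError message, but whether EpisodeInfo is importable and constructible this way, and what state it carries, are assumptions.)

import copyreg

# Assumed import path, copied from the error message.
from envlogger.backends.python.episode_info import EpisodeInfo

def _reduce_episode_info(info):
    # Must return (callable, args); pickle rebuilds the object by calling
    # callable(*args) on the other side.  Here the state is simply dropped;
    # capture and restore the real fields if the worker needs them.
    return (EpisodeInfo, ())

# copyreg.pickle() records the reducer in copyreg.dispatch_table, which the
# stdlib pickler (and dill, used by multiprocess) consults for types it does
# not otherwise know how to serialize.
copyreg.pickle(EpisodeInfo, _reduce_episode_info)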

mikygit commented 2 years ago

Interesting, thanx. What if I don't handle the pickling myself? i.e., it is done automatically when multiprocessing ...

mmckerns commented 2 years ago

I'm not really sure what your last comment means. Multiprocess will do the pickling... it's how the object is sent across the processors. What you are doing with copyreg is teaching multiprocess how to pickle the object.

mikygit commented 2 years ago

Ok, thanx for your help. I managed to fix the error using your advice (get/set_state).
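
(Editor's note: the get/set_state fix presumably refers to __getstate__/__setstate__; a rough, hypothetical sketch of that shape, with made-up field names.)

class EpisodeWrapper:
    """Hypothetical wrapper whose pickled form carries only plain data."""

    def __init__(self, episode=None):
        self._episode = episode                      # may be unpicklable
        self.num_steps = len(episode) if episode is not None else 0

    def __getstate__(self):
        # called by pickle in the parent process: keep only plain state
        return {"num_steps": self.num_steps}

    def __setstate__(self, state):
        # called in the worker process: restore from the plain state
        self._episode = None
        self.num_steps = state["num_steps"]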

mmckerns commented 2 years ago

Ok, great. I'm closing this issue.