p-veloso closed this issue 3 years ago
I can't seem to reproduce this. For me the following code works fine.
from stable_baselines3.ppo import MlpPolicy
from stable_baselines3 import PPO
# from stable_baselines3.common.vec_env import VecMonitor
import supersuit as ss
# from petting_bubble_env_continuous import PettingBubblesEnvironment
from pettingzoo.mpe import simple_push_v2
import gym
env = simple_push_v2.parallel_env()
env = ss.pad_observations_v0(env)
env = ss.black_death_v1(env)
env = ss.pettingzoo_env_to_vec_env_v0(env)
env = ss.concat_vec_envs_v0(env, 4, num_cpus=4, base_class='stable_baselines3')
model = PPO(MlpPolicy, env, verbose=2, gamma=0.999, n_steps=1000, ent_coef=0.01, learning_rate=0.00025, vf_coef=0.5, max_grad_norm=0.5, gae_lambda=0.95, n_epochs=4, clip_range=0.2, clip_range_vf=1, tensorboard_log="./ppo_test/")
model.learn(total_timesteps=1000000, tb_log_name="test", reset_num_timesteps=True)
model.save("bubble_policy_test")
Looking at your stack trace, perhaps the problem is that you are using Windows? Windows only supports the spawn start method for multiprocessing, which requires data to be pickled. We don't officially support Windows. Windows causes an unbearable number of problems for a maintainer, and none of us use Windows ourselves, so it is also hard to test solutions. We strongly recommend you find a Linux or macOS platform for working on these projects. Windows Subsystem for Linux works well for me.
If you want to make a PR to fix this yourself though, feel free. The offending local function which is killing the pickling process appears to be this one: https://github.com/PettingZoo-Team/SuperSuit/blob/master/supersuit/vector_constructors.py#L8. I suppose there might be a solution where, instead of a local function, it is a class which takes the env in its __init__ and overrides __call__, like this one: https://github.com/PettingZoo-Team/SuperSuit/blob/master/supersuit/vector/constructors.py#L5
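To see why a local function is the sticking point, here is a minimal sketch using only the standard pickle module. The names make_local_env_fn, EnvFn, and can_pickle are hypothetical, and a plain dict stands in for a real environment. This is not SuperSuit's code, only an illustration of the picklability difference that spawn-based multiprocessing runs into.

```python
import pickle


def make_local_env_fn(env):
    """Return a closure, similar in shape to a local env_fn helper."""
    def env_fn():
        return env
    return env_fn


class EnvFn:
    """Callable class: stores the env in __init__ and returns it from __call__."""

    def __init__(self, env):
        self.env = env

    def __call__(self):
        return self.env


def can_pickle(obj):
    """Report whether the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


fake_env = {"name": "stand-in for an environment"}  # hypothetical placeholder
print("local function picklable:", can_pickle(make_local_env_fn(fake_env)))
print("callable class picklable:", can_pickle(EnvFn(fake_env)))
```

The closure cannot be pickled because pickle stores functions by their importable name, and a function defined inside another function has none. An instance of a module-level class is stored as a class reference plus its attributes, so it can be pickled as long as the env it holds can be.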
Yes, I am using Windows. I am not very familiar with multiprocessing implementations, but I gave it a try. I replaced the original vec_env_args with:
import cloudpickle

class EnvFn:
    def __init__(self, env):
        self.env = env

    def __call__(self):
        # Return a fresh copy of the env via a cloudpickle round trip
        return cloudpickle.loads(cloudpickle.dumps(self.env))

def vec_env_args(env, num_envs):
    env_fn = EnvFn(env)
    return [env_fn] * num_envs, env.observation_space, env.action_space
It resulted in a different error (starting a new process before the ongoing process has finished its bootstrapping). As I mentioned before, this is not my expertise at all. Thanks for the hints.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\pedro\OneDrive\Documentos\2021 Learning Matters\petting bubble rl\petting_bubble_multi_test.py", line 17, in <module>
env_multi = ss.concat_vec_envs_v0(env_multi, 8, num_cpus=8, base_class='stable_baselines3')
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\site-packages\supersuit\vector_constructors.py", line 48, in concat_vec_envs
vec_env = MakeCPUAsyncConstructor(num_cpus)(*vec_env_args(vec_env, num_vec_envs))
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\site-packages\supersuit\vector\constructors.py", line 38, in constructor
return ProcConcatVec(cat_env_fns, obs_space, act_space, num_fns * envs_per_env, example_env.metadata)
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\site-packages\supersuit\vector\multiproc_vec.py", line 83, in __init__
proc.start()
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
16 cpus available
Exception ignored in: <function ProcConcatVec.__del__ at 0x000001648C105550>
Traceback (most recent call last):
File "C:\Users\pedro\Anaconda3\envs\rl_exercises\lib\site-packages\supersuit\vector\multiproc_vec.py", line 147, in __del__
for pipe in self.pipes:
AttributeError: 'ProcConcatVec' object has no attribute 'pipes'
@weepingwillowben
I understand that there is no official support for Windows. However, when I run on a p2.xlarge AWS instance, it also raises errors and exceptions.
4 cpus available
Using cuda device
-------------------------------
| time/ | |
| fps | 7570 |
| iterations | 1 |
| time_elapsed | 42 |
| total_timesteps | 320000 |
-------------------------------
Segmentation fault (core dumped)
Process Process-4:
(pytorch_latest_p37) ubuntu@ip-172-31-15-100:~/bubbles$ Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-3:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-2:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-1:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
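For the first error, you will have to familiarize yourself with multiprocessing on Windows a bit. The RuntimeError in your traceback points at the if __name__ == '__main__': idiom that spawn-based multiprocessing needs in the main module.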
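The idiom named in the RuntimeError can be sketched as follows. This is a minimal example, not the reporter's script: run_workers and main are hypothetical names, and math.sqrt stands in for real per-process work. The spawn start method is requested explicitly so that the Windows behaviour can be reproduced on Linux or macOS as well.

```python
import math
import multiprocessing as mp


def run_workers(values, num_procs=2):
    """Compute square roots in child processes started with 'spawn'."""
    # 'spawn' starts a fresh interpreter per child; each child re-imports
    # the main module, so that module must be safe to import.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=num_procs) as pool:
        return pool.map(math.sqrt, values)


def main():
    # Everything that creates processes lives here, not at import time.
    print(run_workers([1.0, 4.0, 9.0]))


if __name__ == "__main__":
    # Children import this file under a different module name,
    # so this block only runs in the parent process.
    main()
```

Applied to the script in this thread, the same pattern would mean moving the environment construction, the ss.concat_vec_envs_v0 call, and the training code into a function that is only called under the guard.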
As for the second error, the BrokenPipeError and EOFError exceptions are likely caused by a segfault in the original process: once it dies, the worker processes find their pipes closed. Last time I ran into this problem it was because I was rendering to a screen. There could also be issues if your process is interacting with another process through a network connection or something.
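The link between a dead main process and these worker errors can be reproduced in a single process. In this rough sketch, probe_closed_pipe is a hypothetical helper, and closing one end of a multiprocessing.Pipe stands in for the main process crashing. The exact exception raised by send may vary by platform, so the code only assumes it is some OSError.

```python
import multiprocessing as mp


def probe_closed_pipe():
    """Close one end of a pipe and record what the other end sees."""
    worker_end, main_end = mp.Pipe()
    main_end.close()  # stands in for the main process dying

    try:
        worker_end.recv()
        recv_result = "no error"
    except EOFError:
        recv_result = "EOFError"

    try:
        worker_end.send("report")
        send_result = "no error"
    except OSError as exc:  # BrokenPipeError is a subclass of OSError
        send_result = type(exc).__name__

    worker_end.close()
    return recv_result, send_result


recv_result, send_result = probe_closed_pipe()
print("recv on orphaned pipe:", recv_result)
print("send on orphaned pipe:", send_result)
```

This mirrors the order in the tracebacks above: the worker's pipe.recv() fails with EOFError first, and its attempt to report that failure with pipe.send() then fails as well. Both are symptoms, so the segfault itself is the thing to debug.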
I would make sure that it works for an official PettingZoo environment like MPE on your system first.
I ran a test with simple_adversary_v2 and the problem happened again (now on a g3.4xlarge). Here is the file: mpe_test.zip
16 cpus available
Using cuda device
------------------------------
| time/ | |
| fps | 5047 |
| iterations | 1 |
| time_elapsed | 2 |
| total_timesteps | 12288 |
------------------------------
Segmentation fault (core dumped)
Process Process-4:
(pytorch_latest_p37) ubuntu@ip-172-31-8-173:~/bubbles$ Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-3:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-2:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-1:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/site-packages/supersuit/vector/multiproc_vec.py", line 60, in async_loop
pipe.send((e, tb))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/ubuntu/anaconda3/envs/pytorch_latest_p37/lib/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Thanks for providing the simple example.
Unfortunately, I still cannot reproduce this. On my system, your mpe_test.py code works just fine.
How are you running the code? What versions of Python and the libraries are you running?
Also, are you only having this MPE issue on Windows systems?
@justinkterry You can see that it is Ubuntu from the stack trace.
Right, my apologies
I was using an environment with pytorch from the Deep Learning AMI on AWS. I will try to run again tomorrow to get the specifications.
The AMI uses Python 3.7.10
I installed:
pip install git+https://github.com/vwxyzjn/stable-baselines3
pip install supersuit
These are the packages listed:
Package Version
---------------------------------- -------------------
alabaster 0.7.12
anaconda-client 1.7.2
anaconda-project 0.9.1
anyio 2.2.0
appdirs 1.4.4
argh 0.26.2
argon2-cffi 20.1.0
asn1crypto 1.4.0
astroid 2.5
astropy 4.2
async-generator 1.10
atomicwrites 1.4.0
attrs 20.3.0
autopep8 1.5.5
autovizwidget 0.18.0
Babel 2.9.0
backcall 0.2.0
backports.shutil-get-terminal-size 1.0.0
beautifulsoup4 4.9.3
bitarray 1.6.3
bkcharts 0.2
black 19.10b0
bleach 3.3.0
blis 0.7.4
bokeh 2.2.3
boto 2.49.0
boto3 1.17.12
botocore 1.20.12
Bottleneck 1.3.2
brotlipy 0.7.0
catalogue 2.0.1
certifi 2020.12.5
cffi 1.14.5
chardet 4.0.0
click 7.1.2
cloudpickle 1.6.0
clyent 1.2.2
colorama 0.4.4
contextlib2 0.6.0.post1
cryptography 3.4.6
cycler 0.10.0
cymem 2.0.5
Cython 0.29.22
cytoolz 0.11.0
dask 2021.2.0
decorator 4.4.2
defusedxml 0.6.0
diff-match-patch 20200713
dill 0.3.3
distributed 2021.2.0
docutils 0.16
entrypoints 0.3
environment-kernels 1.1.1
et-xmlfile 1.0.1
fastai 1.0.61
fastcache 1.1.0
fastprogress 1.0.0
filelock 3.0.12
flake8 3.8.4
Flask 1.1.2
Flask-Cors 3.0.10
fsspec 0.8.3
future 0.18.2
gevent 21.1.1
glob2 0.7
gmpy2 2.0.8
google-pasta 0.2.0
greenlet 1.0.0
gym 0.18.0
h5py 2.10.0
hdijupyterutils 0.18.0
HeapDict 1.0.1
html5lib 1.1
idna 2.10
imagecodecs 2021.1.11
imageio 2.9.0
imagesize 1.2.0
importlib-metadata 2.0.0
iniconfig 1.1.1
intervaltree 3.1.0
ipykernel 5.3.4
ipyparallel 6.3.0
ipython 7.20.0
ipython-genutils 0.2.0
ipywidgets 7.6.3
isort 5.7.0
itsdangerous 1.1.0
jdcal 1.4.1
jedi 0.17.2
jeepney 0.6.0
Jinja2 2.11.3
jmespath 0.10.0
joblib 1.0.1
json5 0.9.5
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 6.1.7
jupyter-console 6.2.0
jupyter-core 4.7.1
jupyter-packaging 0.7.12
jupyter-server 1.5.0
jupyterlab 3.0.12
jupyterlab-pygments 0.1.2
jupyterlab-server 2.3.0
jupyterlab-widgets 1.0.0
keyring 22.0.1
kiwisolver 1.3.1
lazy-object-proxy 1.5.2
libarchive-c 2.9
llvmlite 0.34.0
locket 0.2.1
lxml 4.6.3
MarkupSafe 1.1.1
matplotlib 3.3.4
mccabe 0.6.1
mistune 0.8.4
mkl-fft 1.3.0
mkl-random 1.1.1
mkl-service 2.3.0
mock 4.0.3
more-itertools 8.7.0
mpi4py 3.0.3
mpmath 1.1.0
msgpack 1.0.2
multipledispatch 0.6.0
murmurhash 1.0.5
mypy-extensions 0.4.3
nb-conda 2.2.1
nb-conda-kernels 2.3.1
nbclassic 0.2.6
nbclient 0.5.2
nbconvert 6.0.7
nbformat 5.1.2
nest-asyncio 1.5.1
networkx 2.5
nltk 3.5
nose 1.3.7
notebook 6.2.0
numba 0.51.2
numexpr 2.7.3
numpy 1.19.2
numpydoc 1.1.0
nvidia-ml-py3 7.352.0
olefile 0.46
onnx 1.5.0
opencv-python 3.4.13.47
openpyxl 3.0.6
packaging 20.9
pandas 1.2.2
pandocfilters 1.4.3
parso 0.7.0
partd 1.1.0
path 15.1.2
pathlib2 2.3.5
pathspec 0.7.0
pathtools 0.1.2
pathy 0.4.0
patsy 0.5.1
pep8 1.7.1
PettingZoo 1.8.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 7.2.0
pip 21.0.1
pkginfo 1.7.0
plotly 4.14.3
pluggy 0.13.1
ply 3.11
preshed 3.0.5
prometheus-client 0.9.0
prompt-toolkit 3.0.8
protobuf 3.15.6
protobuf3-to-dict 0.1.5
psutil 5.8.0
psycopg2 2.7.5
ptyprocess 0.7.0
py 1.10.0
pyarrow 3.0.0
pycodestyle 2.6.0
pycosat 0.6.3
pycparser 2.20
pycrypto 2.6.1
pycurl 7.43.0.6
pydantic 1.7.3
pydocstyle 5.1.1
pyerfa 1.7.2
pyflakes 2.2.0
pyfunctional 1.4.3
pygal 2.4.0
pyglet 1.5.0
Pygments 2.8.0
pykerberos 1.2.1
pylint 2.7.0
pyls-black 0.4.6
pyls-spyder 0.3.2
pynvml 8.0.4
pyodbc 4.0.0-unsupported
pyOpenSSL 20.0.1
pyparsing 2.4.7
pyrsistent 0.17.3
PySocks 1.7.1
pytest 6.2.2
python-dateutil 2.8.1
python-jsonrpc-server 0.4.0
python-language-server 0.36.2
pytz 2021.1
PyWavelets 1.1.1
pyxdg 0.27
PyYAML 5.4.1
pyzmq 20.0.0
QDarkStyle 2.8.1
QtAwesome 1.0.1
qtconsole 5.0.2
QtPy 1.9.0
regex 2020.11.13
requests 2.25.1
requests-kerberos 0.12.0
retrying 1.3.3
rope 0.18.0
Rtree 0.9.4
ruamel-yaml 0.15.87
s3fs 0.2.0
s3transfer 0.3.4
sagemaker 2.31.1
scikit-image 0.17.2
scikit-learn 0.23.2
scipy 1.6.1
seaborn 0.11.1
SecretStorage 3.3.1
Send2Trash 1.5.0
setuptools 49.6.0.post20210108
simplegeneric 0.8.1
singledispatch 0.0.0
six 1.15.0
sklearn 0.0
smart-open 3.0.0
smclarify 0.1
smdebug-rulesconfig 1.0.1
sniffio 1.2.0
snowballstemmer 2.1.0
sortedcollections 2.1.0
sortedcontainers 2.3.0
soupsieve 2.2
spacy 3.0.5
spacy-legacy 3.0.1
sparkmagic 0.15.0
Sphinx 3.5.1
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 1.0.3
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.4
sphinxcontrib-websupport 1.2.4
spyder 4.2.1
spyder-kernels 1.10.2
SQLAlchemy 1.3.23
srsly 2.4.0
stable-baselines3 1.1.0a1
statsmodels 0.12.2
SuperSuit 2.6.2
sympy 1.7.1
tables 3.6.1
tabulate 0.8.9
tblib 1.7.0
terminado 0.9.2
testpath 0.4.4
textdistance 4.2.1
thinc 8.0.2
threadpoolctl 2.1.0
three-merge 0.1.1
tifffile 2021.1.14
toml 0.10.1
toolz 0.11.1
torch 1.8.0+cu111
torch-model-archiver 0.2.1
torchserve 0.2.1
torchvision 0.9.0+cu111
tornado 6.1
tqdm 4.56.0
traitlets 5.0.5
typed-ast 1.4.2
typer 0.3.2
typing 3.7.4.3
typing-extensions 3.7.4.3
ujson 4.0.2
unicodecsv 0.14.1
urllib3 1.26.4
wasabi 0.8.2
watchdog 1.0.2
wcwidth 0.2.5
webencodings 0.5.1
Werkzeug 1.0.1
wheel 0.36.2
widgetsnbextension 3.5.1
wrapt 1.12.1
wurlitzer 2.0.1
xlrd 2.0.1
XlsxWriter 1.3.7
xlwt 1.3.0
yapf 0.30.0
zict 2.0.0
zipp 3.4.0
zope.event 4.5.0
zope.interface 5.2.0
Sorry. I still can't reproduce this. I will likely have to launch an AWS instance to test this out.
@weepingwillowben, do you see any potential fix for multiprocessing on Windows or AWS Ubuntu in the near future?
Hi, following up on this.
I started an AWS instance with the Deep Learning AMI (Ubuntu 18.04) Version 42.1.
I ran
pip install supersuit
pip install stable_baselines3
Ran your mpe_test.py
And everything worked fine.
Sounds good. The only difference seems to be that I was using
pip install git+https://github.com/vwxyzjn/stable-baselines3
But now the monitor for the vectorized environments is integrated into the official SB3. Also, which of the available environments in the AMI did you use?
I will try again later. Thanks again.
I don't think I used any particular anaconda environment. I just pip installed everything immediately after logging in.
When I logged in, the interface suggested using one of the many available environments. I used source activate to access the latest PyTorch environment, then installed the libraries I mentioned earlier. I will try to run the code later today.
@weepingwillowben, mpe_test.py works well outside of the anaconda environments. However, I still have problems with my custom environments, such as raumplan aws.zip. I cannot run it with your specifications because VecMonitor requires
pip install git+https://github.com/vwxyzjn/stable-baselines3
Then, after I install it and run it on AWS (or on Google Colab), the SB3 class VecEnvWrapper triggers a maximum recursion depth error.
ubuntu@ip-172-31-0-98:~/[raumplan aws.zip](https://github.com/PettingZoo-Team/SuperSuit/files/6348377/raumplan.aws.zip)$ python raumplan_supersuit_train_for_pavilion.py
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 347, in getattr_depth_check
all_attributes = self._get_all_attributes()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 321, in _get_all_attributes
all_attributes.update(self.class_attributes)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 304, in __getattr__
blocked_class = self.getattr_depth_check(name, already_found=False)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 347, in getattr_depth_check
all_attributes = self._get_all_attributes()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 321, in _get_all_attributes
all_attributes.update(self.class_attributes)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 304, in __getattr__
blocked_class = self.getattr_depth_check(name, already_found=False)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 347, in getattr_depth_check
all_attributes = self._get_all_attributes()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 321, in _get_all_attributes
all_attributes.update(self.class_attributes)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 304, in __getattr__
blocked_class = self.getattr_depth_check(name, already_found=False)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 347, in getattr_depth_check
all_attributes = self._get_all_attributes()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 321, in _get_all_attributes
all_attributes.update(self.class_attributes)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 304, in __getattr__
blocked_class = self.getattr_depth_check(name, already_found=False)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 347, in getattr_depth_check
all_attributes = self._get_all_attributes()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 321, in _get_all_attributes
all_attributes.update(self.class_attributes)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 304, in __getattr__
[repeats these 3 lines of errors multiple times]
RecursionError: maximum recursion depth exceeded
Sorry for the delay. This should be a separate issue from the first one.
If you remove this bit of code:
clip_range_vf = env.spec.reward_threshold,
it should work.
While this error message is a terrible one, and I'll look into replacing it with a sane AttributeError, getting an attribute of the base environment spec is not a feature we are looking to support in vector environments (especially not multiprocessing ones). Just get the reward threshold of the environment before it is wrapped in a vector environment.
Note that PettingZoo environments support getting attributes of the underlying environment through the .unwrapped attribute, which returns the unwrapped environment. So if env was a PettingZoo environment, rather than a vector environment, you could just do
env.unwrapped.spec.reward_threshold
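A minimal sketch of the "read it before wrapping" advice, using stand-in classes so it runs without gym or supersuit installed. ToyEnv, ToyVecEnv and the threshold value are all hypothetical; with a real environment the first step would be reading spec.reward_threshold from the base environment instead.

```python
from types import SimpleNamespace


class ToyEnv:
    """Hypothetical stand-in for a single environment that carries a spec."""

    spec = SimpleNamespace(reward_threshold=195.0)  # made-up value


class ToyVecEnv:
    """Hypothetical stand-in for a vector env that does not forward `spec`."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]


base_env = ToyEnv()
# Read what you need from the base environment first ...
reward_threshold = base_env.spec.reward_threshold
# ... and only then build the vector env; it is never asked for `spec`.
vec_env = ToyVecEnv([ToyEnv for _ in range(4)])
```

The value captured in reward_threshold can then be passed wherever it is needed (for example as clip_range_vf) without touching the vector environment's attributes.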
@weepingwillowben I think it still crashes when I test in Colab (I will test it on AWS later). I have just uploaded an example with the code.
Thanks again for the example. We just found the maximum recursion depth internally ourselves. I'll make sure to release that fix today.
This fix was released. Let me know if you have more issues (especially if the initial issue crops up again).
Awesome. It works in google colab with and without multiprocessing.
I will double check the previous problems on Wed.
If everything is fine, I will close this issue.
Just one note: in the Colab notebook, you are still pip installing from the PR fork of stable_baselines3. That PR has been merged, so you should now be able to install from the SB3 master branch.
Hi, I'm actually still having the same issue with stable-baselines3==1.1.0a3 and with the example that @weepingwillowben suggested in this comment.
Any suggestions on what I should try next?
(Running this on a Mac)
Traceback (most recent call last):
File "/Users/aislingpigott/Documents/legendary-pancake/pettingzooTest.py", line 13, in <module>
env = ss.concat_vec_envs_v0(env, 4, num_cpus=4, base_class='stable_baselines3')
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector_constructors.py", line 47, in concat_vec_envs
vec_env = MakeCPUAsyncConstructor(num_cpus)(*vec_env_args(vec_env, num_vec_envs))
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/constructors.py", line 38, in constructor
return ProcConcatVec(cat_env_fns, obs_space, act_space, num_fns * envs_per_env, example_env.metadata)
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 89, in __init__
proc.start()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'vec_env_args.<locals>.env_fn'
Exception ignored in: <function ProcConcatVec.__del__ at 0x179f8b160>
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 153, in __del__
for pipe in self.pipes:
AttributeError: 'ProcConcatVec' object has no attribute 'pipes'
@p-veloso same issue
@apigott, as far as I remember that error disappeared when I tested in colab and aws (ubuntu). With that said, these are the fixes that I used over time
@apigott Are you using the latest version of supersuit? I thought that this issue was fixed.
I pulled supersuit 2.6.4 today with pip install supersuit
Are you using a windows system? MSYS, perhaps?
No, mac
Ah, now the issue is clear.
Since Python 3.8, the default multiprocessing start method on macOS (and only on macOS) is spawn rather than fork.
Looks like gym has a clever way of getting around this problem here: https://github.com/openai/gym/blob/a5a6ae6bc0a5cfc0ff1ce9be723d59593c165022/gym/vector/utils/misc.py#L6
I can make a PR to use this CloudPickleWrapper in supersuit's multiprocessing utilities.
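To make the pickling failure concrete, here is a small standard-library sketch. It does not use gym's CloudPickleWrapper or supersuit's actual code; make_local_factory, make_picklable_factory and toy_env are hypothetical names. It shows that a function defined inside another function cannot be pickled (which is what spawn needs), while a functools.partial over a module-level callable can, assuming the wrapped object is itself picklable.

```python
import copy
import functools
import pickle


def make_local_factory(env):
    """Mimics the failing pattern: the factory is defined inside another function."""

    def env_fn():
        return copy.deepcopy(env)

    return env_fn


def make_picklable_factory(env):
    """Same behaviour, built from a module-level callable that pickle can find by name."""
    return functools.partial(copy.deepcopy, env)


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError):
        return False
    return True


toy_env = {"n_agents": 2}  # hypothetical stand-in for a picklable environment
```

Here can_pickle(make_local_factory(toy_env)) is False and can_pickle(make_picklable_factory(toy_env)) is True. A module-level class with an __init__ that stores the env and a __call__ that returns a copy would behave like the partial.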
In the meantime, you can call
import multiprocessing
multiprocessing.set_start_method("fork")
before creating the environment to fix this issue.
Actually, upon investigation, this is the only solution that does not require a major reworking of multiprocessing support. Please use the above multiprocessing.set_start_method call to fix this issue.
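A slightly more defensive way to apply this, sketched with the standard library only. The helper name ensure_fork_start_method is made up. It checks that fork is available on the platform and uses force=True so that a start method chosen earlier in the process does not cause set_start_method to raise a RuntimeError.

```python
import multiprocessing


def ensure_fork_start_method():
    """Select "fork" if the platform offers it and return the active start method."""
    if "fork" not in multiprocessing.get_all_start_methods():
        # Windows only offers "spawn", so there is nothing to switch to.
        return multiprocessing.get_start_method()
    if multiprocessing.get_start_method(allow_none=True) != "fork":
        # force=True avoids the RuntimeError raised when a method was already set.
        multiprocessing.set_start_method("fork", force=True)
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    # Call this before creating any environments or worker processes.
    print("start method:", ensure_fork_start_method())
```

As in the snippet above, the call has to happen before the vector environment is constructed, since that is where the worker processes are started.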
@weepingwillowben Thanks. I was able to get a working test file by using those lines together with export KMP_DUPLICATE_LIB_OK=TRUE. (I should mention that this git issue also suggests that conda install nomkl works, but I didn't have luck with that either.) I'm not particularly well versed in multiprocessing, as I've only ever used the pathos lib, but here is the stack trace I got without exporting KMP_DUPLICATE_LIB_OK=TRUE.
(zoo) Aislings-MacBook-Pro:legendary-pancake aislingpigott$ python pettingzooTest.py
Using cpu device
OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Abort trap: 6
Process Process-4:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 66, in async_loop
pipe.send((e, tb))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
(zoo) Aislings-MacBook-Pro:legendary-pancake aislingpigott$ Process Process-3:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 66, in async_loop
pipe.send((e, tb))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-2:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 66, in async_loop
pipe.send((e, tb))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Process-1:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 31, in async_loop
instr = pipe.recv()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
raise EOFError
EOFError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/anaconda3/envs/zoo/lib/python3.9/site-packages/supersuit/vector/multiproc_vec.py", line 66, in async_loop
pipe.send((e, tb))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
self._send(header + buf)
File "/opt/anaconda3/envs/zoo/lib/python3.9/multiprocessing/connection.py", line 373, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Hmm. This is an unfortunate error. I wonder when the omp runtime is initialized (import time or when building the model), and whether it can reasonably be done after creating the environments.
Are you still running the same snippet of code from earlier in this github issue?
Yeah, running the same code snippet. The error comes after the imports. "Using cpu device" is printed by model = PPO(...), but from what I can tell with some debug print statements, the model never finishes initializing.
ETA: I'm not sure what you mean by it being possible to move the mp call until after the env is initialized. The initial error stems from initialization of the env, right? So it's not possible to move the mp.set_start_method() call until after the model initialization.
from stable_baselines3.ppo import MlpPolicy
from stable_baselines3 import PPO
# from stable_baselines3.common.vec_env import VecMonitor
import supersuit as ss
# from petting_bubble_env_continuous import PettingBubblesEnvironment
from pettingzoo.mpe import simple_push_v2
import gym
import multiprocessing
print("Imports ok")
multiprocessing.set_start_method("fork")
env = simple_push_v2.parallel_env()
env = ss.pad_observations_v0(env)
env = ss.black_death_v1(env)
env = ss.pettingzoo_env_to_vec_env_v0(env)
env = ss.concat_vec_envs_v0(env, 4, num_cpus=4, base_class='stable_baselines3')
model = PPO(MlpPolicy, env, verbose=2, gamma=0.999, n_steps=1000, ent_coef=0.01, learning_rate=0.00025, vf_coef=0.5, max_grad_norm=0.5, gae_lambda=0.95, n_epochs=4, clip_range=0.2, clip_range_vf=1, tensorboard_log="./ppo_test/")
print("Script is stopped by broken pipe error")
model.learn(total_timesteps=100, tb_log_name="test", reset_num_timesteps=True)
model.save("bubble_policy_test")
Yeah, it is probably initializing once when importing pytorch, then initializing again when using it. Very strange. From reading, it appears that the library is somehow not dealt with correctly during the fork.
I guess the correct long term solution is to support spawn multiprocessing natively like Gym does. I did not anticipate that this would be a problem.
I will create an issue to track this feature.
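For anyone who wants to apply the KMP_DUPLICATE_LIB_OK workaround reported earlier in this thread from inside the script rather than with a shell export, a minimal sketch follows. The helper name is made up, and the OMP hint itself calls this setting unsafe and unsupported, so treat it as a stopgap. The assumption here is that the variable must be set before any library that loads OpenMP (such as torch) is imported.

```python
import os


def allow_duplicate_openmp():
    """Set the workaround flag named in the OMP hint and return its value."""
    os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    return os.environ["KMP_DUPLICATE_LIB_OK"]


allow_duplicate_openmp()
# Import torch / stable_baselines3 only after this point.
```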
Based on the template for multiprocessing in SB3 I decided to check if I could use multiprocessing in SuperSuit. Here are my files: petting bubble rl.zip
However, the version with multiprocessing fails...