openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
MIT License

A2C multiprocessing w/ ROS Gazebo #1120

Open han-kyung-min opened 4 years ago

han-kyung-min commented 4 years ago

First of all, thanks a lot for the nice software.

I am trying to run the A2C method on multiple Gazebo environments created by ROS.

Before creating multiple envs, I first created a single env (one ROS process) using SubprocVecEnv(). Unfortunately, I cannot proceed because I don't know how to deal with the error message below. Everything works fine if I use DummyVecEnv() instead. Perhaps the error is related to the communication protocol between the processes...

The error is raised at the line observation_space, action_space, self.spec = self.remotes[0].recv(), which is in class SubprocVecEnv(VecEnv) in subproc_vec_env.py.

Could you please tell me how to fix/debug this problem?

Best,

Invalid MIT-MAGIC-COOKIE-1 key
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/usr/lib/python3.8/runpy.py", line 263, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/usr/lib/python3.8/runpy.py", line 96, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/usr/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/hankm/catkin_ws/src/openai_examples_projects/my_turtlebot3_openai_example/scripts/start_a3c_env.py", line 38, in <module>
    from openai_ros.task_envs.turtlebot3 import turtlebot3_world
  File "/home/hankm/catkin_ws/src/openai_ros/openai_ros/src/openai_ros/task_envs/turtlebot3/turtlebot3_world.py", line 4, in <module>
    from openai_ros.robot_envs import turtlebot3_env
  File "/home/hankm/catkin_ws/src/openai_ros/openai_ros/src/openai_ros/robot_envs/turtlebot3_env.py", line 4, in <module>
    from openai_ros import robot_gazebo_env
  File "/home/hankm/catkin_ws/src/openai_ros/openai_ros/src/openai_ros/robot_gazebo_env.py", line 7, in <module>
    from openai_ros.msg import RLExperimentInfo
ModuleNotFoundError: No module named 'openai_ros.msg'
Traceback (most recent call last):
  File "/home/hankm/catkin_ws/src/openai_examples_projects/my_turtlebot3_openai_example/scripts/start_a3c_env.py", line 82, in <module>
    env = StartOpenAI_ROS_Environment('TurtleBot3World-v0', 'a2c', 1)
  File "/home/hankm/catkin_ws/src/openai_ros/openai_ros/src/openai_ros/openai_ros_common.py", line 38, in StartOpenAI_ROS_Environment
    env = make_vec_env(task_and_robot_environment_name, env_type, num_env, seed) # SubprocVecEnv in subproc_vec_env.py
  File "/home/hankm/python_ws/baselines/baselines/common/cmd_util.py", line 57, in make_vec_env
    return SubprocVecEnv([make_thunk(i + start_index) for i in range(num_env)])
  File "/home/hankm/python_ws/baselines/baselines/common/vec_env/subproc_vec_env.py", line 66, in __init__
    observation_space, action_space, self.spec = self.remotes[0].recv()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
Exception ignored in: <function SubprocVecEnv.__del__ at 0x7f4ff408b0d0>
Traceback (most recent call last):
  File "/home/hankm/python_ws/baselines/baselines/common/vec_env/subproc_vec_env.py", line 116, in __del__
    self.close()
  File "/home/hankm/python_ws/baselines/baselines/common/vec_env/vec_env.py", line 98, in close
    self.close_extras()
  File "/home/hankm/python_ws/baselines/baselines/common/vec_env/subproc_vec_env.py", line 99, in close_extras
    remote.send(('close', None))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
[turtlebot3_world-2] process has died [pid 9340, exit code 1, cmd /home/hankm/catkin_ws/src/openai_examples_projects/my_turtlebot3_openai_example/scripts/start_a3c_env.py __name:=turtlebot3_world __log:=/home/hankm/.ros/log/0025e25a-bc40-11ea-8364-5de135f213cf/turtlebot3_world-2.log].
log file: /home/hankm/.ros/log/0025e25a-bc40-11ea-8364-5de135f213cf/turtlebot3_world-2*.log
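
One way to read the traceback above: the first block passes through multiprocessing/spawn.py, so it appears to come from the worker process, which re-runs start_a3c_env.py in a fresh interpreter and stops at from openai_ros.msg import RLExperimentInfo. The ConnectionResetError in the parent would then be a consequence of the worker exiting before it answered on the pipe, rather than a separate problem. A small stdlib-only sketch for checking how a module resolves in a given process follows. Note that import_report is a hypothetical helper name, not part of baselines or openai_ros, and comparing its output between processes is only a suggested diagnostic, not a confirmed fix.

```python
import importlib.util
import os
import sys


def import_report(module_name):
    """Describe how `module_name` resolves in the current process.

    Hypothetical debugging helper (an assumption for illustration only).
    Note: for a dotted name, find_spec imports the parent package first.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # ImportError: a parent package could not be imported.
        # ValueError: the module is loaded but has no usable __spec__.
        spec = None
    return {
        "pid": os.getpid(),
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    # Module name taken from the ModuleNotFoundError in the traceback.
    report = import_report("openai_ros.msg")
    print(report["pid"], report["module"], report["found"], report["origin"])
    for entry in report["sys_path"]:
        print("   ", entry)
```

Calling it once at the top of the main script and once inside the function handed to SubprocVecEnv would show whether the two processes see different sys.path entries or pick up a different openai_ros package.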

weiyuhe commented 3 years ago

Have you figured out what the problem was?