ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License

Multiprocessing fails with mujoco image observations #221

Closed: priyankamandikal closed this issue 4 years ago

priyankamandikal commented 4 years ago

Hi, I am trying to run the code on the MuJoCo Reacher-v2 env after modifying the _get_obs() function to return an image instead of joint angles:

import numpy as np

def _get_obs(self):
    # Render an offscreen 224x224 frame instead of returning joint angles.
    frame = self.sim.render(width=224, height=224, mode='offscreen', camera_name=None, device_id=0)
    return frame.astype(np.float32)

When I run the code with num-processes > 1, execution freezes until I forcefully terminate the program. It seems to hang at this point:

$ python main.py --env-name "Reacher-v2" --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 4 --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits
Logging to /tmp/openai-2019-12-23-00-13-17-486602
Creating dummy env object to get spaces
^CTraceback (most recent call last):
  File "main.py", line 227, in <module>
    main()
  File "main.py", line 123, in main
    obs = envs.reset()
  File "a2c_ppo_acktr/a2c_ppo_acktr/envs.py", line 172, in reset
    obs = self.venv.reset()
  File "baselines/baselines/common/vec_env/shmem_vec_env.py", line 67, in reset
    return self._decode_obses([pipe.recv() for pipe in self.parent_pipes])
  File "baselines/baselines/common/vec_env/shmem_vec_env.py", line 67, in <listcomp>
    return self._decode_obses([pipe.recv() for pipe in self.parent_pipes])
  File "miniconda3/envs/pytorch-1.2/lib/python3.7/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "miniconda3/envs/pytorch-1.2/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "miniconda3/envs/pytorch-1.2/lib/python3.7/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt

Note that with num-processes=1 the code runs fine. num-processes > 1 works for MuJoCo joint-angle inputs and Atari image inputs, but not for MuJoCo image inputs. What could be causing this behavior? Do I have to use a different rendering function in MuJoCo?

priyankamandikal commented 4 years ago

Replacing ShmemVecEnv with SubprocVecEnv solves the issue. Turns out shared memory was causing problems for mujoco rendering.
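
For anyone hitting the same hang, the swap might look roughly like the sketch below. It is only an illustration of the workaround described above, not the repo's actual code: the helper names pick_vec_env_backend and build_vec_env and the mujoco_image_obs flag are hypothetical. The shmem_vec_env module path comes from the traceback; the subproc_vec_env and dummy_vec_env paths are assumed to sit beside it in OpenAI baselines, and each class is constructed with default arguments.

```python
def pick_vec_env_backend(num_processes, mujoco_image_obs):
    """Return the name of the baselines vec-env class to use.

    Hypothetical helper: it encodes the workaround from this thread,
    i.e. avoid ShmemVecEnv when MuJoCo renders image observations.
    """
    if num_processes <= 1:
        # A single process needs no inter-process transport at all.
        return "DummyVecEnv"
    if mujoco_image_obs:
        # Workers send observations back over pipes, no shared memory.
        return "SubprocVecEnv"
    # Shared memory worked for joint-angle and Atari inputs.
    return "ShmemVecEnv"


def build_vec_env(env_fns, mujoco_image_obs=True):
    """Build a vectorized env from a list of env-constructing callables."""
    # Imported lazily so pick_vec_env_backend is usable without baselines.
    # Module paths other than shmem_vec_env are assumptions (see lead-in).
    from baselines.common.vec_env.dummy_vec_env import DummyVecEnv
    from baselines.common.vec_env.shmem_vec_env import ShmemVecEnv
    from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv

    backends = {
        "DummyVecEnv": DummyVecEnv,
        "ShmemVecEnv": ShmemVecEnv,
        "SubprocVecEnv": SubprocVecEnv,
    }
    name = pick_vec_env_backend(len(env_fns), mujoco_image_obs)
    return backends[name](env_fns)
```

With --num-processes 4 and image observations, build_vec_env would pick SubprocVecEnv. The trade-off is that every observation is pickled and sent through a pipe, which is likely slower for 224x224 frames than shared memory, but it avoids the hang reported here.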