ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.62k stars, 827 forks

error happened when running with ppo #201

Open tangypnuaa opened 5 years ago

tangypnuaa commented 5 years ago

Environment: Reacher, algorithm: PPO. Error:

Traceback (most recent call last):
  File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/main.py", line 196, in <module>
    main()
  File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/main.py", line 42, in main
    args.gamma, args.log_dir, device, False)
  File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/a2c_ppo_acktr/envs.py", line 91, in make_vec_envs
    envs = ShmemVecEnv(envs, context='fork')
  File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/baselines/common/vec_env/shmem_vec_env.py", line 44, in __init__
    for _ in env_fns]
  File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/baselines/common/vec_env/shmem_vec_env.py", line 44, in <listcomp>
    for _ in env_fns]
  File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/baselines/common/vec_env/shmem_vec_env.py", line 43, in <dictcomp>
    {k: ctx.Array(_NP_TO_CT[self.obs_dtypes[k].type], int(np.prod(self.obs_shapes[k]))) for k in self.obs_keys}
KeyError: <class 'numpy.float64'>

dmitrySorokin commented 5 years ago

I got the same error

ikostrikov commented 5 years ago

Works on my linux machine. Can you provide more details?

JulianoLagana commented 5 years ago

I get the same error.

dmitrySorokin commented 5 years ago

Ubuntu 18.04. I am using the latest commit from baselines (https://github.com/openai/baselines/commit/c57528573ea695b19cd03e98dae48f0082fb2b5e).

It seems like a problem with ShmemVecEnv. Switching to SubprocVecEnv helped me.

ikostrikov commented 5 years ago

Great! Can you submit a PR with the fix?

JulianoLagana commented 5 years ago

After some investigation, I now think the culprit is actually the baselines repo: ShmemVecEnv does not support the float64 dtype yet, so the lookup in its dtype-to-ctypes table raises the KeyError shown in the traceback. Until that is fixed upstream, one can either use SubprocVecEnv (which can be considerably slower) or try my workaround as described here.
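
For readers who want to see the failure mode in isolation, here is a minimal sketch of how a missing table entry produces this kind of KeyError and how one extra entry avoids it. It is not the baselines code and not necessarily the workaround linked above: the table contents, the names DTYPE_TO_CTYPE, PATCHED, alloc_obs_buffer and lookup_fails, and the use of dtype-name strings in place of NumPy scalar classes are all assumptions made so that the example runs with the standard library only.

```python
import ctypes
import multiprocessing

# Hypothetical stand-in for a dtype-to-ctypes table. Keys are dtype names
# instead of NumPy scalar classes, and the entries are illustrative only.
DTYPE_TO_CTYPE = {
    "float32": ctypes.c_float,
    "int32": ctypes.c_int32,
    "uint8": ctypes.c_uint8,
}


def alloc_obs_buffer(dtype_name, shape, table=DTYPE_TO_CTYPE):
    """Allocate a flat shared-memory buffer sized for one observation."""
    size = 1
    for dim in shape:
        size *= dim
    ctype = table[dtype_name]  # raises KeyError if the dtype is not mapped
    return multiprocessing.Array(ctype, size)


def lookup_fails(dtype_name, shape, table):
    """Return True if allocating a buffer for this dtype raises KeyError."""
    try:
        alloc_obs_buffer(dtype_name, shape, table)
    except KeyError:
        return True
    return False


# Workaround sketch: add a double-precision entry to a copy of the table.
PATCHED = dict(DTYPE_TO_CTYPE, float64=ctypes.c_double)
```

With the unpatched table, lookup_fails("float64", (4,), DTYPE_TO_CTYPE) reports the KeyError, while the same call with PATCHED allocates a buffer of four doubles. Another option, if double precision is not needed, would be to cast the environment's observations to float32 before they reach the vectorized wrapper.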