ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.57k stars, 829 forks

a2c with mujoco #189

Closed. NishanthVAnand closed this issue 5 years ago

NishanthVAnand commented 5 years ago

The default hyperparameters work well on ALE tasks but not on MuJoCo tasks, so I think MuJoCo needs different ones. Could you tell me which hyperparameters to use on MuJoCo tasks to reproduce the results reported in the PPO paper (https://arxiv.org/abs/1707.06347)?

ikostrikov commented 5 years ago

As far as I know, the A2C hyperparameters for MuJoCo are lost: https://github.com/openai/baselines/issues/125

araffin commented 5 years ago

Related: https://github.com/hill-a/stable-baselines/issues/249