PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
The default hyperparameters work well on ALE tasks but not on MuJoCo, so I suspect MuJoCo needs different settings. Could you tell me which hyperparameters to use on MuJoCo tasks to reproduce the results reported in the PPO paper (https://arxiv.org/abs/1707.06347)?
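For reference, this is my reading of the MuJoCo settings listed in the PPO paper (Table 3), written as a small Python sketch. The dictionary keys and helper functions are names I made up for illustration, not this repository's command-line flags, and I am assuming a single environment so that one rollout equals the horizon. How these values map onto the repository's options is exactly what I am unsure about.

```python
# MuJoCo hyperparameters as listed in the PPO paper (Table 3).
# Key names are illustrative only; they are NOT this repository's flags.
PPO_PAPER_MUJOCO = {
    "horizon": 2048,          # timesteps collected per policy update (T)
    "adam_stepsize": 3e-4,    # Adam learning rate
    "num_epochs": 10,         # optimization epochs per update
    "minibatch_size": 64,     # samples per minibatch
    "discount_gamma": 0.99,   # discount factor
    "gae_lambda": 0.95,       # GAE parameter
}


def minibatches_per_epoch(cfg):
    """Number of minibatches in one pass over a rollout.

    Assumes a single environment, so the rollout length equals the horizon.
    """
    return cfg["horizon"] // cfg["minibatch_size"]


def gradient_steps_per_update(cfg):
    """Total gradient steps taken on one rollout across all epochs."""
    return cfg["num_epochs"] * minibatches_per_epoch(cfg)


if __name__ == "__main__":
    print("minibatches per epoch:", minibatches_per_epoch(PPO_PAPER_MUJOCO))
    print("gradient steps per update:", gradient_steps_per_update(PPO_PAPER_MUJOCO))
```

If the repository counts minibatches per update rather than minibatch size, the first helper gives the number that would need to be passed under these assumptions.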