ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.57k stars, 829 forks

how to set args.num-steps? #198

Open ChengshuLi opened 5 years ago

ChengshuLi commented 5 years ago

Hi, thank you very much for sharing your code.

This might be a naive question, but what is the general guideline for setting args.num-steps? The default value is 5, but the README sets it to 128 for Atari and 2048 for MuJoCo.

1) Should this be proportional to the maximum number of steps in one episode? Say the agent has a maximum of 500 time steps before an episode terminates. What should I set args.num-steps to?
2) Or is it simply constrained by GPU memory?
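
To make the question concrete, here is a minimal sketch of how I understand these quantities to relate. It is not taken from the repository: `rollout_summary` is a hypothetical helper, and the values of 8 parallel environments and 10 million total environment steps are assumptions chosen only for illustration. It assumes each update consumes `num_steps` transitions from every parallel environment.

```python
import math


def rollout_summary(num_steps, num_processes, max_episode_steps, num_env_steps):
    """Hypothetical helper (not part of the repository).

    Assumes one update uses num_steps transitions from each of
    num_processes parallel environments.
    """
    # Transitions collected before each gradient update.
    transitions_per_update = num_steps * num_processes
    # Updates that fit into the total environment-step budget.
    num_updates = num_env_steps // transitions_per_update
    # Consecutive rollouts needed to cover one full-length episode
    # in a single environment.
    rollouts_per_episode = math.ceil(max_episode_steps / num_steps)
    return {
        "transitions_per_update": transitions_per_update,
        "num_updates": num_updates,
        "rollouts_per_episode": rollouts_per_episode,
    }


# The three num-steps values mentioned above, with a 500-step episode limit.
for num_steps in (5, 128, 2048):
    summary = rollout_summary(
        num_steps,
        num_processes=8,          # assumed value
        max_episode_steps=500,    # from the example in question 1
        num_env_steps=10_000_000, # assumed value
    )
    print(num_steps, summary)
```

With a 500-step episode limit, a rollout of 5 steps covers only a small slice of an episode, while a rollout of 2048 steps spans several episodes, which is why I am unsure whether episode length should drive the choice.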

Any help will be greatly appreciated. Thanks!