PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
Is this an error: num_steps += (t + 1)? #20
Closed
pprivulet closed 4 years ago
Is line 59 of core/agent.py an error: num_steps += (t + 1)?
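For context, here is a minimal sketch of what the expression would compute. It assumes that line 59 runs right after a zero-based `for t in range(...)` episode loop that breaks when the episode ends. The names `count_steps`, `episode_lengths`, and `max_steps` are hypothetical and not taken from the repository. Under that assumption, `t` holds the index of the last step, so `t + 1` is the number of steps taken in the episode.

```python
def count_steps(episode_lengths, max_steps=10000):
    """Total steps across episodes, accumulated the way the question describes.

    Hypothetical stand-in for a sampling loop: each entry of
    episode_lengths is how many steps an episode lasts before done=True.
    """
    num_steps = 0
    for length in episode_lengths:
        for t in range(max_steps):
            # Assumed behavior: the environment reports done on the last step.
            done = (t == length - 1)
            if done:
                break
        # t is the zero-based index of the last step taken,
        # so the episode contributed t + 1 steps.
        num_steps += (t + 1)
    return num_steps


total = count_steps([3, 5])  # two episodes of 3 and 5 steps
```

If the loop index started at 1 instead, the `+ 1` would overcount by one step per episode, so the answer depends on how the loop is written.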