confused about the calculation of R in PPO

sweetice / Deep-reinforcement-learning-with-pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

MIT License


confused about the calculation of R in PPO #4

Open LiuShangYuan opened 5 years ago

LiuShangYuan commented 5 years ago

Hello, I am confused about how R is calculated in PPO. In the file PPO_CartPole_v0.py, you calculate R in the function update, but I think the rewards in the buffer may come from two different trajectories.
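A minimal sketch of the concern, not the repository's code: if the buffer holds transitions from more than one trajectory, a backward pass that accumulates the discounted return R would carry reward from a later episode into an earlier one unless R is reset at each episode boundary. The function name `discounted_returns` and the per-step `dones` flags are assumptions made for illustration; whether the buffer in PPO_CartPole_v0.py stores such a flag is not something this sketch claims.

```python
def discounted_returns(rewards, dones, gamma=0.99):
    """Compute discounted returns over a buffer that may span several episodes.

    rewards: per-step rewards in the order they were collected.
    dones:   per-step flags, True on the last step of an episode
             (hypothetical field, assumed to be stored with each transition).
    """
    returns = []
    R = 0.0
    # Walk backwards so each step's return builds on the step after it.
    for reward, done in zip(reversed(rewards), reversed(dones)):
        if done:
            # This step ends an episode: nothing from the next episode
            # should leak into its return.
            R = 0.0
        R = reward + gamma * R
        returns.append(R)
    returns.reverse()
    return returns


# Two episodes in one buffer: steps 0-2 and steps 3-4.
rewards = [1.0, 1.0, 1.0, 1.0, 1.0]
dones = [False, False, True, False, True]
print(discounted_returns(rewards, dones, gamma=0.5))
# → [1.75, 1.5, 1.0, 1.5, 1.0]
```

Without the reset, the return at step 2 would be 1.75 instead of 1.0, because it would include discounted reward from the second episode.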