ppo-discrete-RNN training question

Lizhi-sjtu / DRL-code-pytorch

Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, SAC.

MIT License

1.1k stars 179 forks

Open lgzid opened 5 months ago

lgzid commented 5 months ago

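In the ppo-discrete-RNN code, shouldn't the RNN hidden state be stored in the buffer and then retrieved during the update to restore the RNN state? From what I can see, the code resets the hidden state every time a mini-batch is sampled. Is this correct?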
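To make the two options in the question concrete, here is a minimal sketch in plain Python. It is not the repository's code: `rnn_step`, `rollout`, `replay`, the scalar weights, and the buffer layout are all hypothetical stand-ins for a real GRU/LSTM and replay buffer. It assumes the network weights are unchanged between rollout and replay, and compares replaying from a reset hidden state with replaying from a hidden state stored in the buffer.

```python
import math


def rnn_step(x, h, w_x=0.5, w_h=0.9):
    """Toy scalar recurrent cell (hypothetical stand-in for a GRU/LSTM)."""
    return math.tanh(w_x * x + w_h * h)


def rollout(observations, h0=0.0):
    """Collect one episode, storing the hidden state fed in at every step."""
    buffer, h = [], h0
    for x in observations:
        h_in = h
        h = rnn_step(x, h_in)
        buffer.append({"obs": x, "h_in": h_in, "h_out": h})
    return buffer


def replay(buffer, start, h_init):
    """Re-run the cell over buffer[start:], starting from h_init."""
    outputs, h = [], h_init
    for transition in buffer[start:]:
        h = rnn_step(transition["obs"], h)
        outputs.append(h)
    return outputs


episode = rollout([1.0, -0.5, 0.3, 0.8, -1.2])
recorded = [t["h_out"] for t in episode]

# Option A: replay the whole episode from a reset (zero) hidden state.
full_from_reset = replay(episode, start=0, h_init=0.0)

# Option B: replay a chunk that starts mid-episode from the stored hidden state.
chunk_from_stored = replay(episode, start=2, h_init=episode[2]["h_in"])

# Mismatch case: replay the same mid-episode chunk from a reset hidden state.
chunk_from_reset = replay(episode, start=2, h_init=0.0)

print(full_from_reset == recorded)        # whole episode, reset state
print(chunk_from_stored == recorded[2:])  # mid-episode chunk, stored state
print(chunk_from_reset == recorded[2:])   # mid-episode chunk, reset state
```

In this sketch, a reset reproduces the rollout's hidden states only when the replayed sequence begins at the first step of an episode. A sequence that begins mid-episode needs the stored `h_in` to match. Which case applies to the repository depends on how its mini-batches are built, which this sketch does not cover.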