Lizhi-sjtu / DRL-code-pytorch

Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.
MIT License
1.1k stars, 179 forks

ppo-discrete-RNN training question #14

Open lgzid opened 5 months ago

lgzid commented 5 months ago

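In the ppo-discrete-RNN code, shouldn't the RNN hidden state be stored in the buffer and then retrieved during the update to restore the RNN's state? From what I can see, the code resets the hidden state every time it draws a mini-batch. Is that correct?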
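To make the two options in the question concrete, here is a minimal sketch. It does not use the repository's code: `rnn_step` and `unroll` are hypothetical helpers, and a toy scalar cell stands in for the GRU/LSTM. It compares replaying a whole episode from a reset hidden state with replaying a mid-episode chunk, with and without a stored hidden state. Whether the repository's mini-batches hold whole episodes is an assumption to check against the actual buffer code.

```python
import math


def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One step of a toy scalar recurrent cell (hypothetical stand-in for a GRU/LSTM)."""
    return math.tanh(w_h * h + w_x * x)


def unroll(xs, h0=0.0):
    """Run the cell over a sequence and return the hidden state after every step."""
    hs, h = [], h0
    for x in xs:
        h = rnn_step(h, x)
        hs.append(h)
    return hs


def same(a, b):
    """Element-wise comparison of two hidden-state sequences."""
    return len(a) == len(b) and all(math.isclose(u, v) for u, v in zip(a, b))


# Rollout: one episode collected step by step, starting from a zero hidden state.
episode = [0.3, -1.2, 0.8, 0.1, -0.5]
rollout_hs = unroll(episode)

# Case 1: the update replays the WHOLE episode from a reset (zero) hidden state.
full_matches = same(unroll(episode), rollout_hs)

# Case 2: the update replays only a chunk starting mid-episode, from a reset state.
start = 2
chunk_reset_matches = same(unroll(episode[start:]), rollout_hs[start:])

# Case 3: same chunk, initialised with the hidden state stored at step start - 1.
chunk_stored_matches = same(
    unroll(episode[start:], h0=rollout_hs[start - 1]), rollout_hs[start:]
)

print(full_matches, chunk_reset_matches, chunk_stored_matches)
```

In this toy setting, resetting the hidden state reproduces the rollout only when the replayed sequence starts at the beginning of the episode. A chunk that starts mid-episode needs the stored hidden state. Note that stored hidden states were computed with the pre-update weights, so they may become stale once the network parameters change during the update.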