High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Target network isn't updated at the correct frequency when `target_network_frequency % train_frequency != 0` #322
Closed
qgallouedec closed 1 year ago
This `if` statement
https://github.com/vwxyzjn/cleanrl/blob/c37a3ec4ef8d33ab7c8a69d4d2714e3817739365/cleanrl/dqn.py#L205
is nested inside this one:
https://github.com/vwxyzjn/cleanrl/blob/c37a3ec4ef8d33ab7c8a69d4d2714e3817739365/cleanrl/dqn.py#L185
Consequently, the target network is updated only when `global_step % train_frequency == 0 and global_step % target_network_frequency == 0`.
For example, when you run
The target network is updated every 5010 timesteps, not every 501 timesteps.
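The effect can be reproduced outside the training script with a minimal sketch. The values `train_frequency=10` and `target_network_frequency=501` are assumptions chosen to match the 5010 vs. 501 figures above, and the helper `target_update_steps` is hypothetical, not part of CleanRL. With the nested check, updates land only on common multiples of both frequencies, so the effective interval is their least common multiple. The un-nested variant shows one possible remedy, not necessarily the fix adopted upstream.

```python
from math import gcd


def target_update_steps(total_steps, train_frequency, target_network_frequency, nested):
    """Return the steps at which the target network would be updated.

    Hypothetical helper for illustration; it only mimics the two modulo checks.
    """
    steps = []
    for global_step in range(1, total_steps + 1):
        if nested:
            # Mimics the reported structure: the target check sits inside the train check.
            if global_step % train_frequency == 0:
                if global_step % target_network_frequency == 0:
                    steps.append(global_step)
        else:
            # Un-nested variant: the target check is independent of the train check.
            if global_step % target_network_frequency == 0:
                steps.append(global_step)
    return steps


# Assumed values, picked to match the numbers quoted in the issue.
train_frequency = 10
target_network_frequency = 501

nested_steps = target_update_steps(20_000, train_frequency, target_network_frequency, nested=True)
flat_steps = target_update_steps(20_000, train_frequency, target_network_frequency, nested=False)

# Effective interval under nesting is lcm(train_frequency, target_network_frequency).
effective_interval = train_frequency * target_network_frequency // gcd(
    train_frequency, target_network_frequency
)

print(nested_steps[:3])    # [5010, 10020, 15030]
print(flat_steps[:3])      # [501, 1002, 1503]
print(effective_interval)  # 5010
```

When `target_network_frequency` is a multiple of `train_frequency`, the two variants agree, which is why the problem only shows up when `target_network_frequency % train_frequency != 0`.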