Closed harshakokel closed 4 years ago
Hello,
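Since the target network update logic for DQN and DDQN is tied to the self._n_train_steps_total variable, self._n_train_steps_total needs to be incremented at every training iteration. If the counter never advances, the modulo check that gates the soft target update evaluates to the same result on every call: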
    """
    Soft target network updates
    """
    if self._n_train_steps_total % self.target_update_period == 0:
        ptu.soft_update_from_to(
            self.qf, self.target_qf, self.soft_target_tau
        )
Thanks, Harsha Kokel.
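To illustrate the point, here is a minimal sketch of the counter increment, not the library's actual trainer. `ToyTrainer`, `train_step`, `_soft_update`, and `n_target_updates` are hypothetical names, the networks are replaced by plain floats, and the soft update is assumed to be Polyak averaging standing in for `ptu.soft_update_from_to`.

```python
class ToyTrainer:
    """Hypothetical stand-in for a DQN trainer; 'networks' are plain floats."""

    def __init__(self, target_update_period=2, soft_target_tau=0.5):
        self.target_update_period = target_update_period
        self.soft_target_tau = soft_target_tau
        self.qf = 1.0         # stand-in for the online Q-network weights
        self.target_qf = 0.0  # stand-in for the target network weights
        self.n_target_updates = 0  # bookkeeping for this example only
        self._n_train_steps_total = 0

    def _soft_update(self):
        # Assumed Polyak averaging: target <- tau * source + (1 - tau) * target
        tau = self.soft_target_tau
        self.target_qf = tau * self.qf + (1.0 - tau) * self.target_qf
        self.n_target_updates += 1

    def train_step(self):
        # (the Q-function gradient step would go here)
        if self._n_train_steps_total % self.target_update_period == 0:
            self._soft_update()
        # Without this increment the counter stays at 0, so the check
        # above passes on every call and the period is ignored.
        self._n_train_steps_total += 1


trainer = ToyTrainer(target_update_period=2)
for _ in range(6):
    trainer.train_step()
print(trainer.n_target_updates)
```

With the increment in place, the target is updated only on steps where the counter is a multiple of `target_update_period`, rather than on every step.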
Thank you!