rail-berkeley / rlkit

Collection of reinforcement learning algorithms
MIT License

Fix for DQN and DDQN target network update #114

Closed. harshakokel closed this 4 years ago

harshakokel commented 4 years ago

Hello,

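The target network update for DQN and DDQN is gated on self._n_train_steps_total, so the counter has to be incremented on every training step. If it is not, the modulo check below sees the same value on every call and the update schedule set by self.target_update_period is never followed.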

        """
        Soft target network updates
        """
        if self._n_train_steps_total % self.target_update_period == 0:
            ptu.soft_update_from_to(
                self.qf, self.target_qf, self.soft_target_tau
            )
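
To illustrate why the counter matters, here is a minimal, dependency-free sketch. It is not rlkit's code: TinyTrainer, train_step, count_updates, and the increment_counter flag are hypothetical names, plain lists of floats stand in for network parameters, and the counter is assumed to start at 0. The soft update is the usual Polyak form, target = tau * source + (1 - tau) * target.

```python
class TinyTrainer:
    """Hypothetical stand-in for a DQN trainer; not rlkit's actual class."""

    def __init__(self, target_update_period=3, soft_target_tau=0.5):
        self.qf = [1.0, 1.0]         # toy "online network" parameters
        self.target_qf = [0.0, 0.0]  # toy "target network" parameters
        self.target_update_period = target_update_period
        self.soft_target_tau = soft_target_tau
        self._n_train_steps_total = 0  # assumption: counter starts at 0
        self.n_target_updates = 0      # bookkeeping for this demo only

    def _soft_update(self):
        # Polyak averaging: move the target a fraction tau toward the source.
        tau = self.soft_target_tau
        self.target_qf = [
            tau * q + (1.0 - tau) * t
            for q, t in zip(self.qf, self.target_qf)
        ]
        self.n_target_updates += 1

    def train_step(self, increment_counter=True):
        # Same gating condition as in the snippet above.
        if self._n_train_steps_total % self.target_update_period == 0:
            self._soft_update()
        # Advancing the counter is what makes the period take effect.
        if increment_counter:
            self._n_train_steps_total += 1


def count_updates(n_steps, increment_counter):
    """Run n_steps training steps and report how many target updates fired."""
    trainer = TinyTrainer()
    for _ in range(n_steps):
        trainer.train_step(increment_counter=increment_counter)
    return trainer.n_target_updates
```

Under these assumptions, nine steps with a period of 3 trigger three target updates when the counter advances, but nine updates when it stays at 0, because 0 % 3 == 0 holds on every call.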

Thanks, Harsha Kokel.

vitchyr commented 4 years ago

Thank you!