Closed (ShawnLue closed this issue 7 years ago)
The target network update has no effect. I have verified this in my experiments: qfunc_loss quickly drops to 0 because the target network stays static.
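One way to confirm whether a target network update is actually applied is to snapshot the target parameters, run the update, and measure how much they changed. The sketch below is only an illustration in plain Python: `soft_update`, `max_abs_change`, and the value of `tau` are hypothetical and not taken from this project, and it assumes a soft (Polyak) update rule, which may differ from what the code here uses.

```python
# Hypothetical sketch: not this project's code. Assumes a soft (Polyak) update.

def soft_update(target_params, online_params, tau=0.01):
    """Return new target parameters: tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


def max_abs_change(before, after):
    """Largest absolute difference between two parameter lists."""
    return max(abs(a - b) for a, b in zip(after, before))


# Toy parameters standing in for network weights.
online = [0.5, -1.2, 2.0]
target = [0.0, 0.0, 0.0]

snapshot = list(target)                       # copy before the update
target = soft_update(target, online, tau=0.01)
change = max_abs_change(snapshot, target)

# If the update were never applied (or its result discarded), change would be 0.
print("target changed:", change > 0.0)
```

Running the same check in the real training loop, with the actual target weights read before and after the update step, should show a nonzero change if the update is effective.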