A framework where a deep Q-Learning Reinforcement Learning agent tries to choose the correct traffic light phase at an intersection to maximize traffic efficiency.
MIT License
405
stars
146
forks
source link
Why are Q network and target network the same? #31
Usually, the Q Network is trained while the parameters of target network are fixed. And every certain steps, the parameters of Q Network will be copied to Target Network. But when I check your code, I find that the Q Network and the Target Network are the same neural network, which confuses me. Could you please help me out?
Usually, the Q Network is trained while the parameters of target network are fixed. And every certain steps, the parameters of Q Network will be copied to Target Network. But when I check your code, I find that the Q Network and the Target Network are the same neural network, which confuses me. Could you please help me out?