YuxuanXie closed this issue 4 years ago
I realized that too. In fact, there is no target network at all in 1.dqn.ipynb.
The following line in the notebook (cell 19):
next_q_values = model(next_state)
should be:
next_q_values = target_model(next_state)
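To illustrate why that one-line change matters, here is a minimal, framework-free sketch of the pattern. It is not the notebook's code: `TinyQ`, `td_target`, `sync_target`, and the interval `C = 3` are hypothetical stand-ins, and the "gradient step" is faked by bumping the weights. The point is only that the bootstrap value comes from `target_model`, which stays frozen between syncs.

```python
import copy


class TinyQ:
    """Hypothetical stand-in for the notebook's network: Q(s, a) = w[a] * s."""

    def __init__(self, weights):
        self.weights = list(weights)

    def __call__(self, state):
        return [w * state for w in self.weights]


def td_target(reward, next_state, done, target_model, gamma):
    # Bootstrap from the frozen target network, not the online network.
    next_q_values = target_model(next_state)
    return reward + gamma * max(next_q_values) * (1.0 - done)


def sync_target(model, target_model):
    # Hard update: copy the online weights into the target network.
    target_model.weights = list(model.weights)


model = TinyQ([1.0, 2.0])
target_model = copy.deepcopy(model)
C = 3  # hypothetical sync interval, in training steps

targets = []
for step in range(1, 7):
    # Stand-in for one gradient step on the online network.
    model.weights = [w + 1.0 for w in model.weights]
    targets.append(
        td_target(reward=0.0, next_state=1.0, done=0.0,
                  target_model=target_model, gamma=0.5)
    )
    if step % C == 0:
        sync_target(model, target_model)

print(targets)
```

In this toy run the target only changes after a sync, even though the online weights change at every step. With `model(next_state)` in place of `target_model(next_state)`, the target would move at every step instead.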
Yes, you are right. Thanks!
Hi,
I have a question about your implementation of DQN. DQN is supposed to copy the current Q-network's weights into the target Q-network every C steps, but I only see this update in your implementation of DDQN. Can you please tell me why it is this way?
From my point of view, your implementation of DDQN is actually DQN.
Best, Yuxuan
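For readers comparing the two algorithms, the difference in the targets can be sketched as follows. This is a plain-Python illustration of the standard definitions, not code from the repository; the function names and the example Q-values are made up for the demonstration. DQN takes the maximum over the target network's Q-values, while Double DQN picks the action with the online network and evaluates that action with the target network.

```python
def dqn_target(reward, gamma, next_q_target):
    # DQN: the target network both selects and evaluates the next action.
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, gamma, next_q_online, next_q_target):
    # Double DQN: the online network selects the action...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...and the target network evaluates it.
    return reward + gamma * next_q_target[best_action]


# Made-up Q-values for a single next state with three actions.
next_q_online = [1.0, 3.0, 2.0]
next_q_target = [4.0, 0.5, 2.0]

y_dqn = dqn_target(1.0, 0.5, next_q_target)
y_ddqn = double_dqn_target(1.0, 0.5, next_q_online, next_q_target)
print(y_dqn, y_ddqn)
```

Both variants rely on a periodically synced target network, so an implementation that has the C-step update but takes a plain max over the target network's values is DQN rather than Double DQN.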