Closed ThisIsIsaac closed 5 years ago
I have implemented a separate target network. After training both the original and the separate-target-network versions for 100 epochs, the latter shows a 30% increase in performance.
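For readers unfamiliar with the idea, here is a minimal sketch of what a separate target network involves. It is not the code behind the numbers above and not the repo's Keras model: it assumes a tiny linear Q-function in pure Python, and every name in it (`Agent`, `q_values`, `SYNC_EVERY`, the constants) is hypothetical and chosen only for illustration. The point it shows is that the bootstrap term of the TD target is computed from a frozen copy of the weights, and that copy is refreshed only every few updates.

```python
import copy
import random

GAMMA = 0.9           # discount factor (illustrative value)
LEARNING_RATE = 0.05  # step size (illustrative value)
SYNC_EVERY = 10       # copy online -> target every N updates (illustrative value)


def q_values(weights, state):
    """Linear Q-function: one weight vector per action, Q(s, a) = w_a . s."""
    return [sum(w * x for w, x in zip(w_a, state)) for w_a in weights]


class Agent:
    def __init__(self, n_features, n_actions, seed=0):
        rng = random.Random(seed)
        # Online weights: these are the ones that get trained.
        self.online = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
                       for _ in range(n_actions)]
        # The target network starts as an exact copy of the online network.
        self.target = copy.deepcopy(self.online)
        self.updates = 0

    def train_step(self, state, action, reward, next_state, done):
        # The bootstrap term uses the frozen target weights, not the online ones.
        bootstrap = 0.0 if done else GAMMA * max(q_values(self.target, next_state))
        td_target = reward + bootstrap
        td_error = td_target - q_values(self.online, state)[action]
        # Gradient step on the squared TD error; only the online weights move.
        self.online[action] = [w + LEARNING_RATE * td_error * x
                               for w, x in zip(self.online[action], state)]
        self.updates += 1
        if self.updates % SYNC_EVERY == 0:
            # Periodic hard sync: the target becomes a fresh copy of the online weights.
            self.target = copy.deepcopy(self.online)
        return td_error
```

With a neural network the structure would be the same: keep two models with identical architecture, train only one, and periodically copy its weights into the other.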
I have implemented a target network in my private work, following my thesis. Yes, I know that a TN gives a boost to performance, but the intent of this repo is to give a starting point to people who want to work on this topic. Therefore I tried to simplify the overall work as much as possible (it still achieves good performance even without a TN) and let others experiment with and expand the work as they like.
You should rename the repo or explicitly mention that it is not a complete implementation of DQN. It is very misleading because nowhere in the code, README, or your thesis is there any mention of your intent to only partially implement DQN.
As you stated, this is not an implementation of the system proposed in the DQN paper. Here I have implemented my own very simple version of a deep Q-learning system, which is not directly inspired by DQN, and that is why you can't see any mention of DQN. In this work, I have included only the parts that I found to be a good trade-off between performance and understandability. For example, I would also like to point out that DQN uses a convolutional NN, whereas this work uses a feedforward NN.
You are free and encouraged to implement a vanilla DQN system applied to traffic control, but this is not what this repo is about. This repo is just meant to give a practical starting point to anyone who wants to dive into this topic using SUMO, also because when I started working on it I found that there weren't any good resources online.
@ThisIsIsaac sorry to bother you, could you share the code that implements the separate target network? I want to use it for better learning.
@ThisIsIsaac thanks a lot!
Question
Is there a reason why you did not implement a separate target network? Unless you tried it and decided not to use one, it is one of the two biggest improvements that the DQN paper suggested. I didn't see any explanation in your paper either. Would love to hear what you think. Thanks
Why a separate network is important
Below is from the DQN paper published in Nature
And the table it refers to: