philtabor / Youtube-Code-Repository

Repository for most of the code from my YouTube channel

dqn_keras.py choose_action function should have q_next and not q_eval #9

Closed Gonm1 closed 4 years ago

Gonm1 commented 4 years ago

Since this is DQN, I think the target network (q_next) should make the predictions at every step in choose_action, while q_eval is the model you .fit for learning, also at every step. This is supposed to help with stability.

https://github.com/philtabor/Youtube-Code-Repository/blob/d70e8cfb640b648d115bd32941549182309d8366/ReinforcementLearning/DeepQLearning/dqn_keras.py#L82

philtabor commented 4 years ago

The improvement in stability comes from using the target network (q_next) to evaluate the new states in the learning function, not from using it to select actions.
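To make the distinction concrete, here is a minimal sketch of where each network is used. It is not the code from dqn_keras.py: the function `td_targets`, the tuple layout of `batch`, and the toy lambda "networks" are assumptions made up for illustration, and a real agent would call `predict` and `fit` on Keras models instead.

```python
import random

def choose_action(q_eval, state, n_actions, epsilon):
    """Epsilon-greedy action selection using the online network (q_eval)."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    q_values = q_eval(state)
    return max(range(n_actions), key=lambda a: q_values[a])

def td_targets(q_eval, q_next, batch, gamma):
    """Build the Q-value targets that q_eval would be fit against.

    Only the bootstrap term max_a' Q(s', a') comes from the target
    network (q_next); everything else uses the online network.
    """
    targets = []
    for state, action, reward, next_state, done in batch:
        target = list(q_eval(state))  # start from the current predictions
        bootstrap = 0.0 if done else max(q_next(next_state))
        target[action] = reward + gamma * bootstrap
        targets.append(target)
    return targets

# Toy stand-ins for the two networks (hypothetical, for illustration only).
q_eval = lambda s: [0.1 * s, 0.5 * s]
q_next = lambda s: [1.0 * s, 0.2 * s]

# Each transition: (state, action, reward, next_state, done)
batch = [(1.0, 0, 1.0, 2.0, False), (1.0, 1, -1.0, 3.0, True)]
targets = td_targets(q_eval, q_next, batch, gamma=0.5)
greedy = choose_action(q_eval, 1.0, n_actions=2, epsilon=0.0)
```

In this sketch the target network never picks an action. It only supplies the value of the next state inside the learning target, so the target stays fixed between the periodic weight copies from q_eval to q_next.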

I recommend reading the Nature paper to clarify this issue.