Closed mpnunez closed 2 months ago
Dueling DQN: Split Q value into the value of the state $V(state)$ and the advantage of each action $A(state,action)$
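A minimal sketch of how the two streams could be recombined, assuming the common aggregation $Q(state,action) = V(state) + A(state,action) - \mathrm{mean}_{a'} A(state,a')$. The function name `dueling_q_values` and the 7-action example are hypothetical illustrations, not taken from the linked commit, which may combine the streams differently.

```python
import numpy as np

def dueling_q_values(value, advantage):
    """Combine V(state) and A(state, action) into Q(state, action).

    value:     array of shape (batch, 1)
    advantage: array of shape (batch, n_actions)
    """
    value = np.asarray(value, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    # Subtracting the per-state mean advantage keeps V and A identifiable:
    # otherwise a constant could be shifted freely between the two streams.
    return value + advantage - advantage.mean(axis=1, keepdims=True)

# Hypothetical head outputs for one state with 7 actions (one per column).
v = np.array([[0.5]])
a = np.array([[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]])
q = dueling_q_values(v, a)
```

With this aggregation the mean of the Q-values over actions equals $V(state)$, and the ranking of actions is determined entirely by the advantage stream.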
Double DQN: Use the online Q-network to select the best action for the next state, but use the target network to compute that action's Q-value
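The target computation described above can be sketched as follows. This is an illustration under stated assumptions: the function name `double_dqn_targets`, the argument layout, and the discount `gamma` are hypothetical, and the linked commit may structure this differently.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_next_online, q_next_target, gamma=0.99):
    """Compute Double DQN training targets for a batch of transitions.

    q_next_online: Q-values of the next states from the online network,
                   shape (batch, n_actions)
    q_next_target: Q-values of the next states from the target network,
                   shape (batch, n_actions)
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_next_online = np.asarray(q_next_online, dtype=float)
    q_next_target = np.asarray(q_next_target, dtype=float)

    # Action selection uses the online network...
    best_actions = np.argmax(q_next_online, axis=1)
    # ...but the value of that action is read from the target network.
    next_q = q_next_target[np.arange(len(best_actions)), best_actions]
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_q

# Hypothetical batch of two transitions with two actions each.
targets = double_dqn_targets(
    rewards=[0.0, 1.0],
    dones=[0, 1],
    q_next_online=[[1.0, 2.0], [0.5, 0.1]],
    q_next_target=[[10.0, 3.0], [4.0, 8.0]],
    gamma=0.9,
)
```

In the first transition the online network prefers action 1, so the target uses the target network's value for action 1 (3.0) rather than its own maximum (10.0). Decoupling selection from evaluation in this way is what reduces the overestimation seen in standard DQN.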
Double DQN: https://github.com/mpnunez/Connect4-AI/commit/67c4dff0563f01d34988becacd3713a73ca1cea9
Dueling DQN: https://github.com/mpnunez/Connect4-AI/commit/423259eaffd4f8c1e867cdb34065659328da09d1