fix #26 for updating TD error formula for PER in ddqn

germain-hug / Deep-RL-Keras

Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)

533 stars 149 forks

Closed parasnaren closed 4 years ago

parasnaren commented 4 years ago

Fix for issue #26 regarding the formula for calculating the TD error for PER.

delta(j) = Reward(j) + gamma(j) * Q_target(S_j, arg max_a Q(S_j, a)) - Q(S_j-1, A_j-1)

Here, Q(S_j, a) is the Q-value predicted by the model for the new state S_j, not the old state S_j-1.
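For reference, the formula above can be sketched in NumPy for a single transition. This is a minimal illustration and not the code in this repository: the function name `ddqn_td_error`, its arguments, and the terminal-state handling via `done` are assumptions made for the example. The three Q-value arrays are assumed to come from the model on the old state, the model on the new state, and the target model on the new state.

```python
import numpy as np

def ddqn_td_error(q_old, q_new_online, q_new_target, action, reward, gamma, done=False):
    """TD error for one transition (S_{j-1}, A_{j-1}, R_j, S_j).

    q_old:        Q(S_{j-1}, .) predicted by the model on the old state
    q_new_online: Q(S_j, .) predicted by the model on the new state
    q_new_target: Q_target(S_j, .) predicted by the target model on the new state
    All names here are hypothetical and used only for illustration.
    """
    if done:
        # Assumption: no bootstrapping from a terminal state.
        target = reward
    else:
        # The model picks the action on the NEW state; the target model evaluates it.
        best_action = int(np.argmax(q_new_online))
        target = reward + gamma * q_new_target[best_action]
    # Subtract the value of the action actually taken in the old state.
    return target - q_old[action]

# Toy values: argmax over q_new_online is action 1, so the target model's
# value for action 1 (1.5) is used: 1.0 + 0.9 * 1.5 - 1.0
q_old = np.array([1.0, 2.0])
q_new_online = np.array([0.5, 3.0])
q_new_target = np.array([4.0, 1.5])
delta = ddqn_td_error(q_old, q_new_online, q_new_target, action=0, reward=1.0, gamma=0.9)
```

In PER, the absolute value of this error (often plus a small constant) would typically be used as the transition's priority.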