Improve dueling_dqn_keras loop

First of all, thank you for all the knowledge you have shared. If it were not for your videos, I could not have understood Q-learning.

I tested this change with my code, and it appears to produce the same result as the existing loop:

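# Remove size-1 axes from q_next
q_next = np.squeeze(q_next)
# Terminal states contribute no future value
q_next[dones] = 0.0
# Target: reward plus discounted next-state value
q_tmporal = rewards + self.gamma * q_next
# Overwrite only the Q-value of the action taken in each sample
q_target[np.arange(self.batch_size), actions] = q_tmporal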
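
For reference, here is a minimal sketch of how the equivalence could be checked outside the agent. It assumes `q_next` has shape `(batch_size, 1)` and `dones` is a boolean array. The per-sample loop is my reading of what the existing code does, and the function names `targets_loop` and `targets_vectorized` are hypothetical, used only for this comparison.

```python
import numpy as np


def targets_loop(q_pred, q_next, actions, rewards, dones, gamma):
    """Per-sample loop (assumed form of the existing code)."""
    q_target = np.copy(q_pred)
    q_next = np.copy(q_next)  # avoid mutating the caller's array
    for idx, terminal in enumerate(dones):
        if terminal:
            q_next[idx] = 0.0
        q_target[idx, actions[idx]] = rewards[idx] + gamma * q_next[idx, 0]
    return q_target


def targets_vectorized(q_pred, q_next, actions, rewards, dones, gamma):
    """Same update using boolean masking and integer-array indexing."""
    q_target = np.copy(q_pred)
    # Squeeze only axis 1 so the batch axis survives a batch of size 1
    q_next = np.squeeze(q_next, axis=1).copy()
    q_next[dones] = 0.0
    q_tmporal = rewards + gamma * q_next
    q_target[np.arange(len(actions)), actions] = q_tmporal
    return q_target


# Random batch standing in for a replay-buffer sample (assumed sizes)
rng = np.random.default_rng(0)
batch_size, n_actions, gamma = 64, 4, 0.99
q_pred = rng.normal(size=(batch_size, n_actions))
q_next = rng.normal(size=(batch_size, 1))
actions = rng.integers(0, n_actions, size=batch_size)
rewards = rng.normal(size=batch_size)
dones = rng.random(batch_size) < 0.2

loop_result = targets_loop(q_pred, q_next, actions, rewards, dones, gamma)
vec_result = targets_vectorized(q_pred, q_next, actions, rewards, dones, gamma)
print(np.allclose(loop_result, vec_result))
```

One difference from the snippet above: the sketch squeezes only `axis=1`, because a bare `np.squeeze` would also drop the batch axis if the batch size were 1.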

philtabor / Youtube-Code-Repository