I created a better implementation of the DQN, one that is more decoupled so it can be used with different environments. After reading more about this topic, I plan to improve the algorithm in two ways:
Double Q-learning: reduces the overestimation bias in the value estimates. Paper
Prioritized experience replay: gives more importance to the transitions from which the agent learns the most. Paper
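As a rough illustration of the first item, here is a minimal sketch of how the double Q-learning target differs from the standard DQN target. It assumes the next-state action values from the online and target networks are already available as plain lists; the function names and the example numbers are hypothetical and are not taken from my implementation.

```python
def standard_q_target(reward, done, q_target_next, gamma=0.99):
    """Standard DQN target: the target network both picks and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_q_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double Q-learning target for a single transition.

    q_online_next and q_target_next are the action values of the next state
    according to the online and target networks (hypothetical inputs).
    """
    if done:
        return reward
    # The online network selects the greedy action...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ...and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


# Made-up values where the two networks disagree about the best action.
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [3.0, 1.5, 0.2]

print(standard_q_target(1.0, False, q_target_next, gamma=0.9))
print(double_q_target(1.0, False, q_online_next, q_target_next, gamma=0.9))
```

Splitting action selection from action evaluation means a single network's overly optimistic estimate is less likely to be picked and trusted at the same time, which is where the lower bias comes from.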
I think these two things can be done this weekend. Explaining them could be more difficult.
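For prioritized experience replay, a minimal sketch of the proportional variant could look like the following. It is only an assumption of how I might structure it: the class name, the list-based storage, and the default values for `alpha`, `beta`, and `eps` are illustrative, and a real implementation would probably use a sum tree to make sampling faster.

```python
import random


class PrioritizedReplayBuffer:
    """Toy proportional prioritized replay buffer (O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew the sampling
        self.eps = eps          # keeps every priority strictly positive
        self.transitions = []
        self.priorities = []
        self.next_index = 0     # position to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # likely to be sampled soon after being stored.
        priority = max(self.priorities, default=1.0)
        if len(self.transitions) < self.capacity:
            self.transitions.append(transition)
            self.priorities.append(priority)
        else:
            self.transitions[self.next_index] = transition
            self.priorities[self.next_index] = priority
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability: P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.transitions)
        indices = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights correct for the non-uniform sampling;
        # here they are normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_weight = max(weights)
        weights = [w / max_weight for w in weights]
        batch = [self.transitions[i] for i in indices]
        return batch, indices, weights

    def update_priorities(self, indices, td_errors):
        # The priority is the magnitude of the TD error plus a small constant.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


buffer = PrioritizedReplayBuffer(capacity=3)
for step in range(4):
    buffer.add(("state", step))          # the fourth add overwrites the oldest
buffer.update_priorities([0, 1, 2], [5.0, 0.1, 0.1])
batch, indices, weights = buffer.sample(batch_size=2)
print(batch, indices, weights)
```

After each training step the sampled indices would be passed back to `update_priorities` together with the new TD errors, and the returned weights would scale each sample's loss.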