We currently train Atari models with DQN + DDQN. VPG (Vanilla Policy Gradient) has been shown to be a better-structured agent (as tested on control problems and health gathering in Doom).
Generate a model + hyperparameters that can solve the Atari retro game Breakout.
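
For orientation, here is a minimal sketch of the two quantities a VPG agent optimizes: the discounted reward-to-go for each step, and the policy-gradient surrogate loss. It is not this project's implementation. The function names, the mean-return baseline, and the default `gamma=0.99` are assumptions chosen for illustration, and the code uses plain Python lists in place of a real network and autodiff framework.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Reward-to-go for one episode: G_t = r_t + gamma * G_{t+1}."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each step reuses the return after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def vpg_loss(log_probs: List[float], returns: List[float]) -> float:
    """Surrogate loss: -mean(log pi(a_t | s_t) * (G_t - baseline)).

    The baseline here is the mean return of the episode (an assumption;
    a learned value function is a common alternative).
    """
    if not returns or len(log_probs) != len(returns):
        raise ValueError("log_probs and returns must be non-empty and equal length")
    baseline = sum(returns) / len(returns)
    weighted = [lp * (g - baseline) for lp, g in zip(log_probs, returns)]
    return -sum(weighted) / len(weighted)


if __name__ == "__main__":
    episode_rewards = [1.0, 0.0, 2.0]  # hypothetical per-step rewards
    print(discounted_returns(episode_rewards, gamma=0.5))
```

In a real agent the log-probabilities would come from the policy network's output for the actions actually taken, and the loss would be minimized with a gradient-based optimizer; the discount factor and the baseline are among the hyperparameters to tune for Breakout.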