google-deepmind / bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent
Apache License 2.0
1.51k stars 182 forks

Can't reproduce DQN performance #14

Closed SunCherry closed 5 years ago

SunCherry commented 5 years ago

I noticed that you changed the optimizer and some hyper-parameters in DQN compared to those in the "Nature" paper. On my side, I can't reproduce the results with either of the two settings. Could you share a learning curve for "Breakout"? I have been struggling with hyper-parameter optimization for two months. Thanks.

aslanides commented 5 years ago

Hi there.

This agent is intended to be a simple instantiation of the DQN algorithm (Q-learning + non-linear function approximation + experience replay); it isn't intended to reproduce the Nature Atari results. There are numerous subtleties related to interfacing with Atari (frame stacking, reward clipping, etc.) that can be tricky to get right. For agents that are set up to run on Atari, see Dopamine (github.com/google/dopamine) or OpenAI baselines (github.com/openai/baselines).
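To give a rough idea of the two Atari details named above, here is a minimal sketch of reward clipping and frame stacking in plain Python. It is not taken from bsuite, Dopamine, or baselines: the names `clip_reward` and `FrameStack` are hypothetical, and the default of 4 stacked frames is an assumption based on the usual Atari setup.

```python
from collections import deque


def clip_reward(reward):
    """Clip a raw reward to -1, 0, or +1 by taking its sign."""
    return (reward > 0) - (reward < 0)


class FrameStack:
    """Keeps the most recent `num_frames` observations as one stacked observation.

    Hypothetical helper for illustration only; frames can be any object.
    """

    def __init__(self, num_frames=4):
        # A bounded deque drops the oldest frame once it is full.
        self._frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        # Repeat the first frame so the stacked observation has a fixed length
        # from the very first step of the episode.
        self._frames.clear()
        for _ in range(self._frames.maxlen):
            self._frames.append(first_frame)
        return tuple(self._frames)

    def step(self, frame):
        # Push the newest frame; the oldest one falls off the other end.
        self._frames.append(frame)
        return tuple(self._frames)
```

In a training loop, the stacked tuple would stand in for the single observation passed to the agent, and the clipped reward would replace the raw game score. Real Atari pipelines typically add further preprocessing on top of this, so treat the sketch as a starting point rather than a full wrapper.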

SunCherry commented 5 years ago

Got it, thanks a lot.