Incredibly high score for Defender

google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

https://github.com/google/dopamine

Apache License 2.0


Incredibly high score for Defender #104

Open cathera opened 5 years ago

cathera commented 5 years ago

Has anyone tried running Defender with Dopamine? I noticed that baselines for Defender are not provided, so I tried running one myself, but the scores are incredibly high. Here are the reference scores in Rainbow: [image not preserved] And this is what I get with C51: [image not preserved]

LinZichuan commented 5 years ago

I had the same problem. By the way, I found that there is no Surround environment in OpenAI Gym. How can I reproduce the IQN and C51 results on this environment?

marintoro commented 5 years ago

Hello, I think this is due to a bug in the underlying ALE library: the rewards are multiplied by 100, for no apparent reason, compared to the true score of the actual game. See mgbellemare/Arcade-Learning-Environment#262 for more details.
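As a rough illustration of a workaround, the sketch below divides each reward by a fixed factor before it reaches the agent or the logged return. The class name `RewardScaleWrapper` and the constant `DEFENDER_REWARD_SCALE` are hypothetical and not part of Dopamine or ALE. The factor of 100 is taken from the comment above rather than verified independently. The wrapper also assumes that `step()` returns a tuple whose second element is the reward.

```python
from typing import Any, Tuple

# Assumption from the comment above: ALE reports Defender rewards
# 100 times larger than the score shown in the actual game.
DEFENDER_REWARD_SCALE = 100.0


class RewardScaleWrapper:
    """Hypothetical wrapper that divides every reward by a fixed factor."""

    def __init__(self, env: Any, scale: float = DEFENDER_REWARD_SCALE):
        if scale == 0:
            raise ValueError("scale must be non-zero")
        self.env = env
        self.scale = scale

    def reset(self, *args, **kwargs):
        # Resetting does not involve rewards, so pass it straight through.
        return self.env.reset(*args, **kwargs)

    def step(self, action) -> Tuple[Any, ...]:
        # Assumes the wrapped step() returns (observation, reward, ...).
        observation, reward, *rest = self.env.step(action)
        return (observation, reward / self.scale, *rest)

    def __getattr__(self, name: str):
        # Only called when normal lookup fails; delegate to the wrapped env.
        if name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)
```

With this in place, a raw reward of 15000 from the wrapped environment would be passed on as 150. Whether rescaling is appropriate depends on how the reference scores being compared against were produced, so this is only a way to make the numbers comparable, not a fix for the underlying library.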

cathera commented 5 years ago

Hi, thanks for the information! I wonder how all the previous work got its results, though...