Issues with the human scores in Atari

kaustubhsridhar commented 1 year ago

Hi,

Thank you for this script which provides the random and human scores in a helpful format: https://github.com/google-deepmind/dqn_zoo/blob/master/dqn_zoo/atari_data.py

But when I compare the human score for Alien, for example, with the one in Extended Data Table 2 of the DQN Nature paper (https://www.nature.com/articles/nature14236), there seems to be a discrepancy: the script gives a human score of 7127.7 for Alien, while the Nature paper gives 6875.

Have the scores been updated since the original paper? Could you please point me to where I can find these scores?

Thanks a lot :)

kaustubhsridhar commented 1 year ago

Also, just FYI, the scores for Carnival and Pooyan are missing from the script linked above.

jqdm commented 1 year ago

Apologies for the delay in responding.

Using Alien as an example, the Nature paper score of 6875.4 corresponds to the "human_at5" testing condition, which limits each episode to 5 minutes of play at 60 FPS (18,000 frames).

The score of 7127.7 in atari_data.py corresponds to the "human_at30" testing condition, which limits each episode to 30 minutes, or 108,000 frames. The paper "A Distributional Perspective on Reinforcement Learning" in particular makes use of this testing condition.
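As a quick check of the frame limits mentioned above, here is a minimal sketch that converts a time limit into a frame cap at 60 FPS. The function name `max_episode_frames` and the `CONDITIONS` dictionary are illustrative only; they are not taken from dqn_zoo.

```python
FPS = 60  # frame rate quoted for the testing conditions above


def max_episode_frames(minutes: float, fps: int = FPS) -> int:
    """Convert an episode time limit in minutes into a frame cap."""
    return int(minutes * 60 * fps)


# Time limits (in minutes) for the two conditions discussed in this thread.
# The dictionary itself is a hypothetical helper, not part of dqn_zoo.
CONDITIONS = {"human_at5": 5, "human_at30": 30}

for name, minutes in CONDITIONS.items():
    print(name, max_episode_frames(minutes))
```

The 30-minute case gives the 108,000 frames quoted above, and the 5-minute case works out to 18,000 frames.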

A fair number of papers use "human_at30_rnd_starts" which also includes start states sampled randomly from human traces.
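To illustrate why the choice of testing condition matters when comparing against published numbers, here is a rough sketch of the usual human-normalized score, (agent - random) / (human - random), evaluated with both Alien human scores from this thread. The random score and agent score below are made-up placeholders, and `human_normalized` is a hypothetical helper written for this example, not a function from dqn_zoo.

```python
def human_normalized(agent: float, random: float, human: float) -> float:
    """Human-normalized score: 0.0 matches random play, 1.0 matches the human baseline."""
    return (agent - random) / (human - random)


# Human scores for Alien quoted in this thread, keyed by testing condition.
ALIEN_HUMAN = {"human_at5": 6875.4, "human_at30": 7127.7}

# Placeholder values for illustration only; they are NOT real results.
PLACEHOLDER_RANDOM = 200.0
PLACEHOLDER_AGENT = 3000.0

for condition, human_score in ALIEN_HUMAN.items():
    value = human_normalized(PLACEHOLDER_AGENT, PLACEHOLDER_RANDOM, human_score)
    print(f"{condition}: {value:.3f}")
```

The same raw agent score normalizes to a slightly higher value against the 5-minute baseline than against the 30-minute one, so it helps to state which condition a reported number uses.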

Regarding Carnival and Pooyan, these games are not among the standard 57 Atari games, so they were omitted.

google-deepmind/dqn_zoo, issue #28