google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
Apache License 2.0

Evaluation Returns vs Training Returns #147

Closed: GoingMyWay closed this issue 3 years ago

GoingMyWay commented 3 years ago

Dear Dopamine developers,

This repo is great work that lets researchers run experiments quickly. I found that Dopamine reports two metrics, training returns and evaluation returns, and that the Dopamine website (https://google.github.io/dopamine/baselines/plots.html#) provides comparisons between some baselines. For researchers who want to compare their ideas with these baselines, which metric (training returns or evaluation returns) should be used?

psc-g commented 3 years ago

Thanks for your message, Alexander. As suggested in "Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents" by Machado et al. (JAIR), training returns should always be reported. In our white paper (https://arxiv.org/abs/1812.06110) we found little difference between training and evaluation returns. Ideally (without compute and time constraints) reporting both is better, but if you have to pick one, the two papers above suggest training returns are the metric to report.
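
For readers unfamiliar with the distinction, here is a minimal sketch of how the two metrics can differ. It is not Dopamine's code: the toy two-armed bandit, the names `run_phase` and `run_iteration`, and the epsilon values are all illustrative assumptions. Training returns are collected while the agent is still exploring and updating its estimates; evaluation returns are collected afterwards with learning switched off and (here) no exploration.

```python
import random

EPISODE_LENGTH = 10  # steps per episode in this toy task (assumed value)


def run_phase(q_values, epsilon, num_episodes, learn, rng):
    """Runs episodes on a toy 2-armed bandit and returns the mean episode return."""
    returns = []
    for _ in range(num_episodes):
        episode_return = 0.0
        for _ in range(EPISODE_LENGTH):
            # Epsilon-greedy action selection over the current estimates.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max(range(2), key=lambda a: q_values[a])
            reward = float(action)  # arm 1 pays 1, arm 0 pays 0
            if learn:
                # Simple running-average update, applied only during training.
                q_values[action] += 0.1 * (reward - q_values[action])
            episode_return += reward
        returns.append(episode_return)
    return sum(returns) / len(returns)


def run_iteration(seed=0):
    """One iteration: a training phase followed by an evaluation phase."""
    rng = random.Random(seed)
    q_values = [0.0, 0.0]
    # Training returns: exploration on, learning on.
    train_avg = run_phase(q_values, epsilon=0.1, num_episodes=50, learn=True, rng=rng)
    # Evaluation returns: exploration off, learning off.
    eval_avg = run_phase(q_values, epsilon=0.0, num_episodes=10, learn=False, rng=rng)
    return train_avg, eval_avg


if __name__ == "__main__":
    train_avg, eval_avg = run_iteration()
    print(f"training return: {train_avg:.2f}, evaluation return: {eval_avg:.2f}")
```

In this sketch the training return includes the cost of early exploration, so it reflects how the agent performed while learning, whereas the evaluation return only reflects the final greedy policy.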

GoingMyWay commented 3 years ago

Thanks.