google / dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
https://github.com/google/dopamine
Apache License 2.0
10.42k stars 1.36k forks

Is it possible to release the evaluation scores of the baseline agents? #210

Open zhixuan-lin opened 1 year ago

zhixuan-lin commented 1 year ago

First, thanks for this awesome codebase; it has helped me a lot :) I have three questions:

  1. Based on #147 and the white paper, the results in the baseline folder are training returns rather than evaluation returns. Would it be possible to also release the evaluation returns, if they are available? I'm asking because I'm running some ensemble methods that behave very differently during training and evaluation. The white paper shows that, for the agents in the repo, the choice between evaluation and training returns does not matter much for the 3 games tested, but I'm not sure whether this also holds for the other 57 games. Even if it does, I would prefer to compare evaluation returns for an apples-to-apples comparison.
  2. Why do the baseline results contain only 199 iterations (indexed from 0 to 198), given that we always run 200 iterations?
  3. In the MICo paper, did you report training returns or evaluation returns?

Thanks!