YeWR / EfficientZero

Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.
GNU General Public License v3.0

How to evaluate the model #22

Closed yueyang130 closed 2 years ago

yueyang130 commented 2 years ago

Thanks for your great work!

When I run your code, I find that the scores from the test bash script are always a little higher than the scores from the evaluation stage in the training bash script (during training, the model is tested every 10k steps).

Here are some results I got from the scripts. The left column is from the train bash script and the right is from the test bash script.

| Game | Train eval | Test |
| --- | --- | --- |
| CrazyClimber | 7246 | 9603 |
| BankHeist | 419 | 454 |

I have glanced through the two bash scripts and the code. As I understand it, both scripts evaluate the agent in exactly the same way: the agent is evaluated with 32 seeds and the mean of the 32 scores is reported.
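For reference, the evaluation protocol I described above can be sketched like this (a minimal illustration, not the actual EfficientZero code; `evaluate_episode` is a hypothetical stand-in for rolling out the trained agent once):

```python
import random

def evaluate_episode(seed):
    # Hypothetical stand-in for one evaluation rollout of the agent.
    # Here it just returns a seeded placeholder score.
    rng = random.Random(seed)
    return 400 + rng.random() * 100

# Both bash scripts appear to average the score over 32 seeded episodes.
scores = [evaluate_episode(seed) for seed in range(32)]
mean_score = sum(scores) / len(scores)
print(f"mean over {len(scores)} seeds: {mean_score:.1f}")
```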

So I have two questions,

  1. Why are the scores from the test bash script always a little higher than the scores from the evaluation stage in the training bash script?

  2. Which script did you use to get the results in the paper?

Looking forward to your reply.

yueyang130 commented 2 years ago

I found what the problem is: 'model.p' is the best model saved during training, not the last one. The training-time evaluation reports the score of the current (last) model, while the test script loads 'model.p', so its scores are systematically a little higher.
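To illustrate the behaviour described above (names like `model_latest.p` and `save_checkpoint` are illustrative, not taken from the repository): the trainer keeps 'model.p' pointing at the best-scoring checkpoint, so a later, worse-scoring model does not overwrite it.

```python
import os
import pickle
import tempfile

best_score = float("-inf")
ckpt_dir = tempfile.mkdtemp()

def save_checkpoint(weights, score):
    """Save the latest weights, and update 'model.p' only on improvement."""
    global best_score
    # Always overwrite the latest checkpoint (hypothetical filename).
    with open(os.path.join(ckpt_dir, "model_latest.p"), "wb") as f:
        pickle.dump(weights, f)
    # Only overwrite 'model.p' when the evaluation score improves.
    if score > best_score:
        best_score = score
        with open(os.path.join(ckpt_dir, "model.p"), "wb") as f:
            pickle.dump(weights, f)

save_checkpoint({"step": 10000}, score=7246)
save_checkpoint({"step": 20000}, score=6800)  # worse eval: best is kept

with open(os.path.join(ckpt_dir, "model.p"), "rb") as f:
    print(pickle.load(f)["step"])  # prints 10000: the best model, not the last
```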