Closed zmonoid closed 5 years ago
I already do this. In `test.py`, a new environment is created and `env.eval()` is called. `env.eval()` sets `self.training = False`; `self.training` is the flag that, when true, activates the code path for terminating on loss of life.
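To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not the repository's actual code: `ToyGame` is a hypothetical stand-in for the Atari emulator (3 lives, one lost every 2 steps), and `Env` and `run_episode` are simplified assumptions that only mirror the `training` flag and the `eval()` call mentioned in the comment.

```python
class ToyGame:
    """Hypothetical emulator stand-in: starts with 3 lives, loses one every 2 steps."""

    def __init__(self, lives=3):
        self.start_lives = lives
        self.reset()

    def reset(self):
        self._lives = self.start_lives
        self._t = 0

    def act(self):
        # Advance one step; every second step costs a life. Reward is always 1.
        self._t += 1
        if self._t % 2 == 0:
            self._lives -= 1
        return 1.0

    def lives(self):
        return self._lives

    def game_over(self):
        return self._lives == 0


class Env:
    """Simplified wrapper (an assumption, not the real class) with a training flag."""

    def __init__(self, game):
        self.game = game
        self.training = True  # episodic life is active by default
        self.lives = game.lives()

    def train(self):
        self.training = True

    def eval(self):
        # Disables the terminate-on-life-loss code path.
        self.training = False

    def reset(self):
        self.game.reset()
        self.lives = self.game.lives()

    def step(self):
        reward = self.game.act()
        done = self.game.game_over()
        if self.training:
            lives = self.game.lives()
            if lives < self.lives:
                done = True  # treat loss of life as terminal during training only
            self.lives = lives
        return reward, done


def run_episode(env):
    """Run until the wrapper signals done and return the summed reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        reward, done = env.step()
        total += reward
    return total


train_env = Env(ToyGame())
print(run_episode(train_env))  # 2.0: episode ends at the first lost life

eval_env = Env(ToyGame())
eval_env.eval()
print(run_episode(eval_env))   # 6.0: episode runs until all lives are gone
```

In this sketch the evaluation score covers the whole game, while the training episode stops at the first lost life. A real wrapper would typically also avoid a full emulator reset after a life-loss termination, which is left out here for brevity.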
@Kaixhin Thanks for the information.
In quite a lot of implementations and papers, the reported scores are measured when the game is over rather than at loss of life (episodic life is only used during training).
You may want to consider removing episodic life from the testing environment so that the scores match those reported.