Closed by fuxianh 5 years ago
A moving average of the cumulative reward per episode is logged to tensorboard by default. Collecting the coin (and thus completing the level) gives a reward of 10, so the mean episode reward lies between 0 and 10, and the percentage of levels solved is just the mean reward multiplied by 10. For example, a mean reward of 7.5 corresponds to 75% of levels solved. Besides being logged to tensorboard, the mean reward is also printed during training (see ppo2.py).
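As a rough illustration of that conversion, here is a minimal sketch. The function name `percent_levels_solved` and the constant `COIN_REWARD` are made up for this example and are not part of the repo; it also assumes that each episode's cumulative reward is either 0 or 10, i.e. that the coin is the only source of reward.

```python
COIN_REWARD = 10.0  # assumed: reward for collecting the coin, as stated above


def percent_levels_solved(episode_rewards):
    """Convert per-episode cumulative rewards into a percentage of levels solved.

    Hypothetical helper, not from the repo. Assumes each entry is 0 (level
    not solved) or COIN_REWARD (coin collected).
    """
    if len(episode_rewards) == 0:
        raise ValueError("need at least one episode reward")
    mean_reward = sum(episode_rewards) / len(episode_rewards)
    # mean_reward / COIN_REWARD is the fraction solved; times 100 gives percent.
    # With COIN_REWARD = 10 this is the same as mean_reward * 10.
    return mean_reward / COIN_REWARD * 100.0


# Three of four episodes ended with the coin collected.
print(percent_levels_solved([10, 0, 10, 10]))
```

The same conversion can be applied to the mean reward value read from tensorboard or from the printed training output.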
I do not see a "Levels Solved" variable that can be obtained from the train_agent and test_agent files. This is the key metric for generalization ability. Can you show me how to get it during training and testing? Thank you very much.