Kautenja closed this issue 6 years ago
OpenAI Baselines has Atari-specific Gym wrappers for this sort of behavior, so it doesn't need to be built into the agents: https://github.com/openai/baselines/blob/master/baselines/common/atari_wrappers.py
The wrappers are implemented. Currently the agent is not penalized when an episode terminates; instead, it receives a penalty for losing a life. This should generalize across the full domain of games. Closing the issue for now.
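For anyone landing here later, this is roughly what "penalize the loss of a life, not the end of the episode" looks like as a wrapper. It is only a sketch, not the code used in this repo or in Baselines: the class name `LifeLossPenalty`, the `lives_fn` callable, and the default penalty of -1.0 are all assumptions made for illustration, and the wrapped environment is assumed to follow the classic `reset()` / `step(action) -> (obs, reward, done, info)` interface.

```python
class LifeLossPenalty:
    """Hypothetical wrapper: subtract a penalty whenever a life is lost.

    Assumptions (not from the original thread):
      * `env` exposes reset() and step(action) -> (obs, reward, done, info)
      * `lives_fn` is a zero-argument callable returning remaining lives
      * `penalty` is added to the reward on each lost life (negative value)
    """

    def __init__(self, env, lives_fn, penalty=-1.0):
        self.env = env
        self.lives_fn = lives_fn
        self.penalty = penalty
        self.lives = 0

    def reset(self):
        obs = self.env.reset()
        # Remember how many lives the episode starts with.
        self.lives = self.lives_fn()
        return obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        lives = self.lives_fn()
        # Penalize only a drop in the life counter. An episode that ends
        # without losing a life (e.g. a win) gets no penalty.
        if lives < self.lives:
            reward += self.penalty
        self.lives = lives
        return obs, reward, done, info
```

With an Atari environment, `lives_fn` could be something like `lambda: env.unwrapped.ale.lives()`, which is how the Baselines wrappers read the life counter. Because termination alone is never penalized, a game without lives that ends on a win passes its reward through unchanged.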
The current behavior is to penalize the end of an episode to encourage the agent to prolong episodes (games). In cases where a game is never "solved" (i.e., it can be played indefinitely), this surely makes sense. However, in the case of Pong, the game is "solved" when either player reaches 21 points. If the agent wins, it is currently penalized. Should situations like this be addressed? Or does this not matter too much in the long run?