Closed artofbeinghuman closed 5 years ago
The problem seems to be that the agent doesn't get his last life deducted when freezing to death: info stays at {'ale.lives': 1}. Since at no point info == {'ale.lives': 0}, the environment also doesn't return done = True. However, if the agent dies by drowning (if you supply the down action at every step), then upon dying in his last life, ale.lives is set to 0 and done = True is returned.
Can anybody with a bit more experience say whether this has to be fixed in gym, or whether it is actually a problem in the underlying ALE?
Thanks, Marvin
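One way to check this observation is to log every change in info['ale.lives'] together with the final done flag. The following is only a sketch, assuming the old gym step API that returns (observation, reward, done, info). The names trace_lives and FakeEnv are hypothetical; FakeEnv merely imitates the reported behavior so that the helper can be run without ALE installed.

```python
def trace_lives(env, action, max_steps=1000):
    """Step with a fixed action; record (step, lives) whenever the life count changes."""
    env.reset()
    changes, last, done = [], None, False
    for step in range(1, max_steps + 1):
        # Old gym API (assumption): step returns (obs, reward, done, info).
        _, _, done, info = env.step(action)
        lives = info.get("ale.lives")
        if lives != last:
            changes.append((step, lives))
            last = lives
        if done:
            break
    return changes, done


class FakeEnv:
    """Hypothetical stand-in, NOT the real Frostbite env: one death every 3 steps.

    With stuck_at_one=True the last life is never deducted and done stays
    False, imitating the freezing case described above.
    """

    def __init__(self, lives=4, stuck_at_one=False):
        self.start, self.stuck = lives, stuck_at_one

    def reset(self):
        self.lives, self.t = self.start, 0
        return None

    def step(self, action):
        self.t += 1
        if self.t % 3 == 0 and not (self.stuck and self.lives == 1):
            self.lives -= 1
        return None, 0.0, self.lives == 0, {"ale.lives": self.lives}
```

If the real environment behaves as described, the trace for the NOOP action would stop changing at one remaining life, while the trace for the down action would end with a zero entry and done set to True.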
It looks like gym just calls game_over, which calls isTerminal on the environment (https://github.com/mgbellemare/Arcade-Learning-Environment/blob/f7fff8733c8cc0f54d749ddeaf29bd7f478d6f0f/src/games/supported/Frostbite.cpp#L61). This certainly looks like a bug, just not a bug in gym. Could you please file it on the ALE repo? https://github.com/mgbellemare/Arcade-Learning-Environment/issues
Hello, I have stumbled upon a peculiar thing with the FrostbiteNoFrameskip-v4 environment. Consider the following code snippet, where I run the env indefinitely, at each step giving the 0-th action, which according to env.get_action_meanings() is NOOP, meaning the agent will do nothing.

As expected, the agent stands around doing nothing until he freezes to death, upon which one life is deducted. This goes on until he runs out of lives. At that point I would expect the final env.step(0) to return done=True, so that I can break out of the loop. However, this does not happen. Instead, the environment goes into a mode that I can only describe as "Demo Play", as if it were showcasing the game in a video. I will add a screenshot of this. The agent stays in this "Demo Mode" indefinitely, dying several times without losing lives and without gaining any points.

If we change the above toy example and let the agent go downwards all the time (env.step(5)), then upon dying the environment sends done=True and the script quits the while loop successfully.

What is going on?
Best, Marvin
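The snippet referred to in the post did not survive in this thread. A minimal sketch of the loop it describes might look like the following, with a step cap added so that the script cannot hang when done never arrives. It assumes the old gym API, where env.step returns (observation, reward, done, info). The names run_fixed_action and CountdownEnv are hypothetical, and CountdownEnv is only a stub for trying the helper without ALE.

```python
def run_fixed_action(env, action, max_steps=100_000):
    """Repeat one action until the env reports done or max_steps is hit.

    Returns (steps_taken, done).
    """
    env.reset()
    steps, done = 0, False
    while not done and steps < max_steps:
        # Old gym API (assumption): step returns (obs, reward, done, info).
        _, _, done, _ = env.step(action)
        steps += 1
    return steps, done


class CountdownEnv:
    """Hypothetical stub: reports done after n steps, or never if n is None."""

    def __init__(self, n=None):
        self.n = n

    def reset(self):
        self.t = 0
        return None

    def step(self, action):
        self.t += 1
        finished = self.n is not None and self.t >= self.n
        return None, 0.0, finished, {}
```

With gym and its Atari extras installed, the same helper could be called on gym.make("FrostbiteNoFrameskip-v4") with action 0 for NOOP and action 5 for the downward case. Going by the report above, only the latter would be expected to come back with done set to True before the cap is reached.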