Atari Breakout does not reset on env.reset() when episodic_life=True. This can be observed, for example, in the number of lives decreasing even though no steps are being taken. I also generated videos of the episodes, which show that the environment is not reset and instead appears to perform a no-op for the agent. It behaves as expected with episodic_life=False.
Pong and Assault show similar behaviour: there, too, the environment visibly does not reset.
To Reproduce
The following code reproduces the problem:
import envpool

env = envpool.make("Breakout-v5", env_type="gymnasium", num_envs=1, seed=42, episodic_life=True)
for _ in range(1000):
    # No env.step() is ever called, so every reset should report a fresh game.
    obs, info = env.reset()
    assert info["lives"] == 5, f"info['lives'] is {info['lives']}"
This results in:
Traceback (most recent call last):
  File "xxx/envpool_reset_bug.py", line 8, in <module>
    assert info["lives"] == 5, f"info['lives'] is {info['lives']}"
AssertionError: info['lives'] is [4]
Expected behavior
I expect every call to env.reset() to fully reset the environment, so the number of lives should be 5 after each reset.
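To make this expectation easier to check across games and settings, here is a minimal sketch of a helper that records the lives reported after each of several consecutive resets. It only assumes a gymnasium-style reset() that returns (obs, info) with a "lives" entry. The names lives_after_resets, first_env_value and StubEnv are hypothetical and made up for illustration; StubEnv is a stand-in that mimics the reported symptom and is not envpool code.

```python
def first_env_value(value):
    """Return the entry for the first environment as an int.

    With num_envs=1 the info values come back as length-1 arrays
    (the traceback above shows [4]), but plain ints are accepted too.
    """
    try:
        return int(value[0])
    except (TypeError, IndexError):
        return int(value)


def lives_after_resets(env, num_resets=10):
    """Call env.reset() repeatedly, without stepping, and collect the lives."""
    lives = []
    for _ in range(num_resets):
        _obs, info = env.reset()
        lives.append(first_env_value(info["lives"]))
    return lives


class StubEnv:
    """Hypothetical stand-in environment, NOT envpool code.

    With full_reset=False it loses one life on every reset after the
    first, which imitates the symptom described in this report.
    """

    def __init__(self, start_lives=5, full_reset=True):
        self.start_lives = start_lives
        self.full_reset = full_reset
        self.lives = start_lives
        self.resets = 0

    def reset(self):
        self.resets += 1
        if self.full_reset:
            self.lives = self.start_lives
        elif self.resets > 1 and self.lives > 1:
            self.lives -= 1
        return None, {"lives": [self.lives]}


if __name__ == "__main__":
    print("full reset:   ", lives_after_resets(StubEnv(full_reset=True), 3))
    print("partial reset:", lives_after_resets(StubEnv(full_reset=False), 3))
```

With envpool installed, the same helper could presumably be called on the env created in the reproduction script, once with episodic_life=True and once with episodic_life=False, to compare the two settings side by side.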