sail-sg / envpool

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
https://envpool.readthedocs.io
Apache License 2.0

[BUG] Atari Breakout does not reset with episodic_life=True #301

Open LabChameleon opened 6 months ago

LabChameleon commented 6 months ago

Describe the bug

Atari Breakout is not reset by env.reset() when episodic_life=True. For example, the number of lives decreases across repeated reset calls even though no actual steps are taken. Videos I generated of the episodes also show that the environment is not reset; instead it appears to perform a no-op for the agent. With episodic_life=False it behaves as expected.

Pong and Assault behave similarly: the environment visibly does not reset.
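To illustrate the symptom without envpool, here is a minimal toy sketch. It assumes a baselines-style episodic-life wrapper whose reset() takes a no-op step unless the game is really over. Whether envpool implements reset this way is an assumption, not something confirmed from its source. ToyLivesEnv and EpisodicLifeWrapper are hypothetical names, and the toy game simply loses one life every three steps.

```python
class ToyLivesEnv:
    """Hypothetical stand-in for an Atari game: loses one life every `steps_per_life` steps."""

    def __init__(self, lives=5, steps_per_life=3):
        self.start_lives = lives
        self.steps_per_life = steps_per_life
        self.reset()

    def reset(self):
        # Full reset: restore all lives and restart the step counter.
        self.lives = self.start_lives
        self.t = 0
        return self.lives

    def step(self, action=0):
        # The action is ignored; the toy game loses a life on a fixed schedule.
        self.t += 1
        if self.t % self.steps_per_life == 0:
            self.lives -= 1
        game_over = self.lives == 0
        return self.lives, game_over


class EpisodicLifeWrapper:
    """Assumed behavior: reset() is a full reset only after a real game over."""

    def __init__(self, env):
        self.env = env
        self.real_done = True  # force a full reset on the first call

    def reset(self):
        if self.real_done:
            lives = self.env.reset()
            self.real_done = False
        else:
            # Soft reset: advance the game with a no-op instead of resetting it.
            lives, self.real_done = self.env.step(0)
        return lives


env = EpisodicLifeWrapper(ToyLivesEnv())
lives_seen = [env.reset() for _ in range(8)]
print(lives_seen)  # [5, 5, 5, 4, 4, 4, 3, 3]
```

In this toy model, repeated reset() calls alone are enough to make the lives counter drop, which matches the observation that lives decrease without any explicit step() call.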

To Reproduce

The following code reproduces the problem:

import envpool
env = envpool.make("Breakout-v5", env_type="gymnasium", num_envs=1, seed=42, episodic_life=True)
for _ in range(1000):
    obs, info = env.reset()
    assert info["lives"] == 5, f"info['lives'] is {info['lives']}"

This results in

Traceback (most recent call last):
  File "xxx/envpool_reset_bug.py", line 8, in <module>
    assert info["lives"] == 5, f"info['lives'] is {info['lives']}"
AssertionError: info['lives'] is [4]

Expected behavior

I expect env.reset() to fully reset the environment, so the number of lives should be 5 after every reset.

System info

import envpool, numpy, sys
print(envpool.__version__, numpy.__version__, sys.version, sys.platform)
0.8.4 1.26.4 3.10.4 (main, Oct 18 2023, 19:39:07) [GCC 11.3.0] linux

Checklist