[BUG] Atari Breakout does not reset with episodic_life=True

Describe the bug

Atari Breakout is not reset by env.reset() when episodic_life=True. For example, the number of lives decreases across repeated resets even though no steps are taken. Videos I generated of the episodes also show that the environment is not reset; instead it appears to perform no-op actions for the agent. With episodic_life=False it behaves as expected.

Pong and Assault show similar behaviour: the recorded episodes show that the environment is not reset there either.
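For context, here is a toy sketch of one possible explanation. It assumes the reset follows the convention of the EpisodicLifeEnv wrapper in OpenAI Baselines, where reset() only restarts the game once all lives are gone and otherwise lets the running game continue with no-op frames. Whether envpool implements episodic_life this way has not been verified here. ToyGame, frames_per_miss and noop_frames are hypothetical names made up for the illustration, not envpool or ALE code.

```python
class ToyGame:
    """Hypothetical stand-in for an Atari game; not envpool or ALE code."""

    def __init__(self, start_lives=5, frames_per_miss=30):
        self.start_lives = start_lives
        self.frames_per_miss = frames_per_miss
        self.full_reset()

    def full_reset(self):
        # Restart the whole game: all lives back, ball timer restarted.
        self.lives = self.start_lives
        self.frames_left = self.frames_per_miss

    def noop(self):
        # Assumption: with no paddle movement the ball is missed
        # after a fixed number of frames, which costs one life.
        if self.lives == 0:
            return
        self.frames_left -= 1
        if self.frames_left == 0:
            self.lives -= 1
            self.frames_left = self.frames_per_miss

    @property
    def game_over(self):
        return self.lives == 0


def reset(game, episodic_life, noop_frames=10):
    """Baselines-style reset: a full restart only when the game is really over."""
    if not episodic_life or game.game_over:
        game.full_reset()
    else:
        # The game keeps running through the "reset", so lives can still be lost.
        for _ in range(noop_frames):
            game.noop()
    return game.lives


game = ToyGame()
lives_seen = [reset(game, episodic_life=True) for _ in range(8)]
print(lives_seen)  # [5, 5, 4, 4, 4, 3, 3, 3]
```

Under this assumption, calling reset repeatedly without stepping reproduces the symptom: the lives counter keeps dropping until the game is over, and only then returns to 5. With episodic_life=False the toy always reports 5.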

To Reproduce

The following code reproduces the problem:

import envpool
env = envpool.make("Breakout-v5", env_type="gymnasium", num_envs=1, seed=42, episodic_life=True)
for _ in range(1000):
    obs, info = env.reset()
    assert info["lives"] == 5, f"info['lives'] is {info['lives']}"

This results in

Traceback (most recent call last):
  File "xxx/envpool_reset_bug.py", line 8, in <module>
    assert info["lives"] == 5, f"info['lives'] is {info['lives']}"
AssertionError: info['lives'] is [4]

Expected behavior

I expect env.reset() to reset the environment, so the number of lives should always be 5.
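The expectation can also be expressed as a small reusable check. This is a minimal sketch: lives_after_resets and _StubEnv are hypothetical helpers introduced for illustration, and the only assumption about the environment is a gymnasium-style reset() that returns (obs, info) with a "lives" entry, either a scalar or one value per environment.

```python
def lives_after_resets(env, num_resets=1000):
    """Return the distinct `lives` values reported by repeated env.reset() calls."""
    seen = set()
    for _ in range(num_resets):
        _, info = env.reset()
        lives = info["lives"]
        # Batched environments report one value per env; wrap plain scalars.
        values = lives if hasattr(lives, "__iter__") else [lives]
        seen.update(int(v) for v in values)
    return seen


class _StubEnv:
    """Hypothetical stub that loses a life on every reset, for demonstration only."""

    def __init__(self, lives=5):
        self.lives = lives

    def reset(self):
        info = {"lives": [self.lives]}
        self.lives = max(self.lives - 1, 0)
        return None, info


observed = lives_after_resets(_StubEnv(), num_resets=3)
print(observed)
```

For an environment that resets correctly, the returned set should contain exactly one value, so `lives_after_resets(env) == {5}` would be the expected result for Breakout.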

System info

Describe how the library was installed: pip
Python version: 3.10.4
Versions of any other relevant libraries:
- envpool: 0.8.4
- jax: 0.4.26
- gymnasium: 0.29.1 (not sure if relevant)

import envpool, numpy, sys
print(envpool.__version__, numpy.__version__, sys.version, sys.platform)

0.8.4 1.26.4 3.10.4 (main, Oct 18 2023, 19:39:07) [GCC 11.3.0] linux

Checklist

[x] I have checked that there is no similar issue in the repo (required)
[x] I have read the documentation (required)
[x] I have provided a minimal working example to reproduce the bug (required)
