Closed vwxyzjn closed 2 years ago
Actually this happens to other envs:
import envpool
import numpy as np
from atari_data import atari_human_normalized_scores
num_envs = 1
for env_id in envpool.list_all_envs():
# filter out non atari envs
if "v5" not in env_id:
continue
envs = envpool.make(
env_id,
env_type="gym",
num_envs=num_envs,
episodic_life=True,
reward_clip=True,
)
envs.reset()
has_lives = False
info = envs.step(np.zeros(num_envs, dtype=int))[-1]
if info["lives"].sum() > 0:
has_lives = True
else:
continue
envs.reset()
for i in range(10000):
_, _, next_done, info = envs.step(np.random.randint(0, envs.action_space.n, num_envs))
if info["lives"].sum() == 0:
# print(f"step={i}, lives is", info["lives"].sum())
break
# take an additional step after all lives are exhausted
_, _, next_done, info = envs.step(np.random.randint(0, envs.action_space.n, num_envs))
if info["lives"].sum() == 0:
print(f"{env_id} does not reset the life counter")
Atlantis-v5 does not reset the life counter
Atlantis2-v5 does not reset the life counter
BankHeist-v5 does not reset the life counter
DonkeyKong-v5 does not reset the life counter
Frogger-v5 does not reset the life counter
KeystoneKapers-v5 does not reset the life counter
LostLuggage-v5 does not reset the life counter
MrDo-v5 does not reset the life counter
SirLancelot-v5 does not reset the life counter
Turmoil-v5 does not reset the life counter
WizardOfWor-v5 does not reset the life counter
You cannot instantly check if lives in t == 0 and lives in t+1 > 0: https://github.com/openai/baselines/blob/ea25b9e8b234e6ee1bca43083f8f3cf974143998/baselines/common/atari_wrappers.py#L76-L80
There's a bug in this line:
https://github.com/sail-sg/envpool/blob/ea86c2b77d12aaa58725bfeb1d701e3207f11822/envpool/atari/atari_env.h#L202
should be 0 < env_->lives() && env_->lives() < lives_
I'll make a PR to fix this.
Thank you! Also a related feature request - would it be possible to have a variable in the info section indicating if the environment is terminated?
import envpool
import numpy as np
num_envs = 1
for env_id in envpool.list_all_envs():
# filter out non atari envs
if "v5" not in env_id:
continue
envs = envpool.make(
env_id,
env_type="gym",
num_envs=num_envs,
episodic_life=True,
reward_clip=True,
)
envs.reset()
has_lives = False
info = envs.step(np.zeros(num_envs, dtype=int))[-1]
if info["lives"].sum() > 0:
has_lives = True
else:
continue
envs.reset()
for i in range(20000):
_, _, done, info = envs.step(np.random.randint(0, envs.action_space.n, num_envs))
if info["lives"].sum() == 0:
break
if info["lives"].sum() > 0:
print(f"{env_id} steps too long")
continue
# for normal atari (e.g., breakout)
# take an additional step after all lives are exhausted
_, _, next_done, info = envs.step(np.random.randint(0, envs.action_space.n, num_envs))
if done and info["lives"].sum() > 0:
print(f"{env_id} pass check")
continue
assert not done
for i in range(2000):
_, _, done, info = envs.step(np.random.randint(0, envs.action_space.n, num_envs))
if done:
break
_, _, next_done, info = envs.step(np.random.randint(0, envs.action_space.n, num_envs))
if info["lives"].sum() > 0:
print(f"{env_id} pass check in {i} steps")
else:
print(f"{env_id} does not reset the life counter")
Result after bug fix:
Adventure-v5 steps too long
AirRaid-v5 pass check
Alien-v5 pass check
Amidar-v5 pass check
Assault-v5 pass check
Asterix-v5 pass check
Asteroids-v5 pass check
Atlantis-v5 pass check in 146 steps
Atlantis2-v5 pass check in 30 steps
Backgammon-v5 steps too long
BankHeist-v5 pass check in 64 steps
BasicMath-v5 pass check
BattleZone-v5 pass check
BeamRider-v5 pass check
Berzerk-v5 pass check
Blackjack-v5 pass check
Breakout-v5 pass check
Casino-v5 pass check
Centipede-v5 pass check
ChopperCommand-v5 pass check
CrazyClimber-v5 pass check
Crossbow-v5 pass check
Darkchambers-v5 steps too long
Defender-v5 pass check
DemonAttack-v5 pass check
DonkeyKong-v5 pass check in 215 steps
Earthworld-v5 steps too long
ElevatorAction-v5 steps too long
Entombed-v5 pass check
Et-v5 pass check in 1371 steps
FlagCapture-v5 pass check
Frogger-v5 pass check in 55 steps
Frostbite-v5 pass check
Galaxian-v5 pass check
Gopher-v5 pass check
Gravitar-v5 pass check
Hangman-v5 pass check
HauntedHouse-v5 pass check
Hero-v5 pass check
HumanCannonball-v5 pass check
Jamesbond-v5 pass check
Kaboom-v5 pass check
Kangaroo-v5 pass check
KeystoneKapers-v5 pass check in 1164 steps
KingKong-v5 pass check
Klax-v5 pass check
Krull-v5 pass check
KungFuMaster-v5 pass check
LostLuggage-v5 pass check in 120 steps
MarioBros-v5 pass check
MiniatureGolf-v5 steps too long
MontezumaRevenge-v5 pass check
MrDo-v5 pass check in 74 steps
MsPacman-v5 pass check
NameThisGame-v5 pass check
Othello-v5 pass check
Pacman-v5 pass check
Phoenix-v5 pass check
Pitfall-v5 pass check
Pitfall2-v5 steps too long
Pooyan-v5 pass check
Qbert-v5 pass check
Riverraid-v5 pass check
RoadRunner-v5 pass check
Robotank-v5 pass check
Seaquest-v5 pass check
SirLancelot-v5 pass check in 38 steps
Solaris-v5 pass check
SpaceInvaders-v5 pass check
SpaceWar-v5 steps too long
StarGunner-v5 pass check
Superman-v5 steps too long
Surround-v5 pass check
TicTacToe3d-v5 pass check
TimePilot-v5 pass check
Trondead-v5 pass check
Turmoil-v5 pass check in 160 steps
Tutankham-v5 pass check
UpNDown-v5 pass check
Venture-v5 pass check
VideoCheckers-v5 pass check
VideoChess-v5 steps too long
VideoCube-v5 steps too long
VideoPinball-v5 pass check
WizardOfWor-v5 pass check in 15 steps
WordZapper-v5 pass check
YarsRevenge-v5 pass check
Zaxxon-v5 pass check
Thank you! Also a related feature request - would it be possible to have a variable in the info section indicating if the environment is terminated?
Sure! Do you have any idea about the naming?
How about info['terminated']
?
What does that even mean though that you can continue stepping into the environment when all the lives are exhausted?
What does that even mean though that you can continue stepping into the environment when all the lives are exhausted?
Yes.
import gym
e = gym.make("AtlantisNoFrameskip-v4")
e.reset()
for t in range(10000):
o, r, d, i = e.step(0)
print(t, d, i)
if d:
break
1673 False {'ale.lives': 0}
1674 False {'ale.lives': 0}
1675 False {'ale.lives': 0}
1676 False {'ale.lives': 0}
1677 False {'ale.lives': 0}
1678 False {'ale.lives': 0}
1679 False {'ale.lives': 0}
1680 False {'ale.lives': 0}
1681 False {'ale.lives': 0}
1682 False {'ale.lives': 0}
1683 False {'ale.lives': 0}
1684 False {'ale.lives': 0}
1685 False {'ale.lives': 0}
1686 False {'ale.lives': 0}
1687 False {'ale.lives': 0}
1688 False {'ale.lives': 0}
1689 False {'ale.lives': 0}
1690 True {'ale.lives': 0}
Describe the bug
When all the lives are exhausted in
Atlantis-v5
, making an additional step does not reset the life counter, whereas inBreakout-v5
it does. I am not sure if this is something particular with the Atlantis environment though.To Reproduce
Steps to reproduce the behavior.
Please try to provide a minimal example to reproduce the bug. Error messages and stack traces are also helpful.
Please use the markdown code blocks for both code and stack traces.
Expected behavior
If all the lives are exhausted, making an additional step should reset the life counter.
System info
Describe the characteristic of your environment:
Reason and Possible fixes
Maybe this is
Checklist