I'm not sure where you see 6. I see 4:
>>> import gym
>>> gym.version.VERSION
'0.8.1'
>>> env=gym.make('Breakout-v0')
[2017-05-14 16:04:19,806] Making new env: Breakout-v0
>>> env
<TimeLimit<AtariEnv<Breakout-v0>>>
>>> env.action_space
Discrete(4)
>>> env.unwrapped.get_action_meanings()
['NOOP', 'FIRE', 'RIGHT', 'LEFT']
Hi @tlbtlbtlb, it seems that a newer version of gym has a different definition:
>>> import gym
>>> gym.version.VERSION
'0.8.2'
>>> env=gym.make('Breakout-v0')
[2017-05-15 11:34:41,915] Making new env: Breakout-v0
>>> env.action_space
Discrete(6)
>>> env.unwrapped.get_action_meanings()
['NOOP', 'FIRE', 'RIGHT', 'LEFT', 'RIGHTFIRE', 'LEFTFIRE']
For Breakout, 4 makes more sense because the fire button is only used to start the game. A learning agent will learn substantially faster without the extra actions. The difference in action space would be between versions of atari-py, not gym. You should be using atari-py 0.1.1.
@tlbtlbtlb @JustinYou1992 As per my observation, the FIRE button is required every time the agent loses a life. In Breakout the agent has 5 lives per episode, as indicated in the "info" returned by step, and the ball will simply disappear if we don't press FIRE after the loss of a life.
This can easily be replicated with:
>>> import gym
>>> import random
>>> env = gym.make('Breakout-v0')
[2017-05-16 21:38:47,888] Making new env: Breakout-v0
>>> observation = env.reset()
>>> _, _, done, _ = env.step(1)  # Fire
>>> while not done:
...     env.render()
...     _, _, done, _ = env.step(random.choice([2, 3]))  # Right, Left
...
In this case, the episode never terminates.
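For what it's worth, here is a rough sketch of an agent-side workaround: watch the lives count reported by the env and press FIRE again whenever a life is lost. The "ale.lives" key in the info dict is an assumption about this gym/atari-py version, so check what your version actually returns.

import gym
import random

env = gym.make('Breakout-v0')
observation = env.reset()
_, _, done, info = env.step(1)       # FIRE to launch the first ball
lives = info.get('ale.lives')        # assumed key; inspect your info dict
while not done:
    _, _, done, info = env.step(random.choice([2, 3]))  # RIGHT, LEFT
    if info.get('ale.lives') != lives:                   # a life was just lost
        lives = info.get('ale.lives')
        _, _, done, _ = env.step(1)                      # press FIRE to relaunch the ball
env.close()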
For simple, quick learning, the old action space gave the option of using RIGHTFIRE and LEFTFIRE, so the FIRE button was not needed.
That seems like a slight problem for hand-written agents, but a learning agent (the goal of gym) should be able to figure it out.
@koulanurag @tlbtlbtlb In that case, how do we know which game configuration is used in most current papers? It seems unfair if different papers use different game configurations but report the final scores as usual... 😂
One of the motivations for gym is to release environments that are at least consistent between researchers. We're planning to release wrappers soon that (we think) make them identical to the published work from DeepMind. I believe their results use a wrapper that clicks the fire button automatically to start each game. @shelhamer, do you know what action space they use for Breakout?
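For illustration, a minimal sketch of what such a fire-on-reset helper could look like; this is not the wrapper OpenAI or DeepMind actually ship, just the idea, and it assumes action 1 means FIRE (checked via get_action_meanings()).

import gym

def fire_reset(env):
    # Reset the env and press FIRE once so the ball is served automatically.
    assert env.unwrapped.get_action_meanings()[1] == 'FIRE'
    obs = env.reset()
    obs, _, done, _ = env.step(1)  # FIRE
    if done:                       # very unlikely right after reset, but be safe
        obs = env.reset()
    return obs

env = gym.make('Breakout-v0')
obs = fire_reset(env)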
> what action space they use for Breakout?
The original DQN and the Nature edition of the DQN made use of the xitari fork of ALE, which has this minimal action space for Breakout: https://github.com/deepmind/xitari/blob/master/games/supported/Breakout.cpp#L88-L91. Note that this is the same action space for Breakout as in the new v4 Atari gym environment: https://github.com/openai/atari-py/blob/master/atari_py/ale_interface/src/games/supported/Breakout.cpp#L83-L86.
@shelhamer Thanks. I find that when I update to atari-py 0.0.21, every version of Breakout in gym has 4 actions, including v0 and v4. But it is still somewhat annoying, since Breakout is just a single game and we have 49+ games in total for research; we have no idea whether the others are also inconsistent with the ALE environment with respect to the number of actions.
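One quick way to check is to loop over a handful of env ids and print their action meanings; a sketch (the list of games here is arbitrary, and the -v4 ids assume a recent enough gym/atari-py):

import gym

for game in ['Breakout', 'Pong', 'SpaceInvaders', 'Seaquest']:
    for version in ['v0', 'v4']:
        env = gym.make('{}-{}'.format(game, version))
        print(env.spec.id, env.action_space, env.unwrapped.get_action_meanings())
        env.close()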
Can we get rid of the time limits as well? For a few Atari games they definitely limit performance. I assume the time limit is there to aid training, but we can impose that ourselves during training and lift it once performance gets good enough that the limit becomes a handicap.
It sounds like the original issue here is fixed, please file a new issue if that is not the case.
@JustinYou1992 You can disable the minimal action set with something like AtariEnv(game=game, obs_type='image', frameskip=1, full_action_space=True). There's still one Atari game with a different action space (Skiing?), but the rest should all be the same.
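For example, something along these lines should show the difference (assuming AtariEnv can be imported from gym.envs.atari in your gym version):

from gym.envs.atari import AtariEnv

# Default: the minimal action set for the game (4 actions for Breakout).
minimal = AtariEnv(game='breakout', obs_type='image', frameskip=1)
print(minimal.action_space, minimal.get_action_meanings())

# full_action_space=True: the full 18-action legal ALE set.
full = AtariEnv(game='breakout', obs_type='image', frameskip=1, full_action_space=True)
print(full.action_space, full.get_action_meanings())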
@dgriff777 You can create your environment using AtariEnv directly, and there should be no time limit.
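In other words, constructing the env directly skips the TimeLimit wrapper that gym.make applies via the registry; a quick sketch (same assumed import path as above):

import gym
from gym.envs.atari import AtariEnv

wrapped = gym.make('Breakout-v0')                  # <TimeLimit<AtariEnv<...>>>: episode length is capped
raw = AtariEnv(game='breakout', obs_type='image')  # bare env: episodes end only on game over
print(wrapped, type(raw).__name__)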
Hi, I am using gym (Atari) and ALE at the same time, playing Atari games such as Breakout.
In gym, "Breakout-v0" has 6 actions, but in ALE, "breakout.bin" has 4 actions.
Is this the same game?