openai / gym

A toolkit for developing and comparing reinforcement learning algorithms.
https://www.gymlibrary.dev

"Lining up" action spaces in Atari #743

Closed: atgambardella closed this issue 5 years ago

atgambardella commented 6 years ago

Hi, I've noticed that action indices have different effects across the Atari envs: the 3rd action means "left" in some envs and "right" in others. I would like to instantiate Atari envs so that the action space always has 18 actions, and ideally so that each of these 18 actions always behaves as it would on a real Atari joystick (so that in Pong, executing the diagonal up-right action would send the paddle up). Is there an easy way to do this?

Edit: for what it's worth, I was able to achieve this effect by changing self._action_set = self.ale.getMinimalActionSet() to self._action_set = self.ale.getLegalActionSet() in atari_env.py and then running pip install again, but I was wondering if there's a cleaner way.
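
To make the mismatch concrete, here is a minimal sketch in plain Python (no gym needed). The 18 action names follow the ALE joystick ordering; the two minimal action sets are hypothetical, made up for illustration, and a real game reports its own set through getMinimalActionSet().

```python
# The 18 ALE joystick actions, listed in ALE order (index == ALE action code).
ALE_ACTIONS = [
    "NOOP", "FIRE", "UP", "RIGHT", "LEFT", "DOWN",
    "UPRIGHT", "UPLEFT", "DOWNRIGHT", "DOWNLEFT",
    "UPFIRE", "RIGHTFIRE", "LEFTFIRE", "DOWNFIRE",
    "UPRIGHTFIRE", "UPLEFTFIRE", "DOWNRIGHTFIRE", "DOWNLEFTFIRE",
]

# Hypothetical minimal action sets (lists of ALE action codes) for two
# made-up games. These are assumptions for illustration, not real game data.
MINIMAL_SETS = {
    "game_a": [0, 1, 3, 4],  # NOOP, FIRE, RIGHT, LEFT
    "game_b": [0, 3, 4],     # NOOP, RIGHT, LEFT
}


def meaning(action_set, index):
    """Return the joystick action that agent action `index` triggers."""
    return ALE_ACTIONS[action_set[index]]


# With the full (legal) action set, the agent's index is the ALE code itself.
full_set = list(range(len(ALE_ACTIONS)))

for name, minimal in MINIMAL_SETS.items():
    print(name,
          "| minimal set, index 2:", meaning(minimal, 2),
          "| full set, index 3:", meaning(full_set, 3))
```

With the minimal sets, index 2 maps to a different joystick action in each game, whereas with the full set a given index maps to the same joystick action regardless of the game.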

DennisSoemers commented 6 years ago

I suspect it should also be possible to change this from your own code after creating the environment, using something like this:

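import gym
from gym import spaces

env = gym.make(env_name)  # env_name: the id of any Atari env
unwrapped = env.unwrapped
# Replace the minimal action set with the full legal action set
unwrapped._action_set = unwrapped.ale.getLegalActionSet()
unwrapped.action_space = spaces.Discrete(len(unwrapped._action_set))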

This is obviously still not ideal (and I haven't verified that it works correctly yet). I do think it's important to add official support for this, since the paper "Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents" recommends doing exactly this. (A number of other recommendations from that paper are also currently difficult to implement without hacky workarounds.)

christopherhesse commented 5 years ago

You should not use gym.make() in this case, just create the AtariEnv directly:

from gym.envs.atari.atari_env import AtariEnv
env = AtariEnv(game=game, obs_type='image', frameskip=1, full_action_space=True)

This will give you the same actions on almost all games. The exception is Skiing, which has a smaller action space (perhaps to prevent people from pausing the game).