sail-sg / envpool

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
https://envpool.readthedocs.io
Apache License 2.0
1.08k stars 99 forks source link

Atari option for repeat_action_probability #14

Closed brett-daley closed 2 years ago

brett-daley commented 2 years ago

The -v5 Gym Atari environments have sticky actions enabled by default (with repeat_action_probability=0.25, see here). This makes it impossible to replicate the original results from several key papers, especially the DQN Nature paper.

Would it be possible to add an option to the Atari environment options that lets the user change repeat_action_probability to a different value? I believe that internally this can be accomplished by forwarding the argument to either gym.make or the ALE constructor.

Trinkle23897 commented 2 years ago

I think there may be some misunderstanding:

In *NoFrameskip-v4 setting with openai wrappers: https://github.com/openai/gym/blob/0.18.3/gym/envs/__init__.py#L656 and https://github.com/openai/gym/blob/0.18.3/gym/envs/atari/atari_env.py#L33 You can see the default value of repeat_action_probability is set to 0.

Currently, EnvPool's atari implementation follows the openai wrappers (that is to say we follow *NoFrameskip-v4 setting) instead of the latest gym version (*-v5), and as you can see https://github.com/sail-sg/envpool/blob/c6fd55dc61e07e5e547623711ab35d6ac8af06ed/envpool/atari/atari_env.h#L115

Would it be possible to add an option to the Atari environment options that lets the user change repeat_action_probability to a different value?

Sure, will do.

brett-daley commented 2 years ago

Oh awesome! I didn't see that the default value is 0, but it would still be good to have an option to change it. Thanks for looking into it.

Trinkle23897 commented 2 years ago

And also here: https://github.com/mgbellemare/Arcade-Learning-Environment/blob/master/src/python/gym.py#L148 it is set to 0

brett-daley commented 2 years ago

Indeed. My confusion was because the environment names in your documentation (https://envpool.readthedocs.io/en/latest/api/atari.html#available-tasks) are called -v5, so I assumed that they followed the defaults for Gym -v5 instead of *NoFrameSkip-v4. Thank you for clarifying.

Trinkle23897 commented 2 years ago

That's indeed an issue. Do you have any better idea instead of posting something like "our implementation is not equal to gym's *-v5" on the top of Atari env API documentation?

brett-daley commented 2 years ago

Maybe you could just drop the Gym-like versioning and just call them Adventure, AirRaid, etc?

Of course, that would mean that you can't change the default arguments in future releases (without unexpectedly changing behavior). However, since people can change the values themselves anyway, I don't think that's a huge issue. For example, if I do envpool.make('Pong-v5', env_type='gym', repeat_action_probability=0.9), then it's already not equivalent to Pong-v5, so the name didn't matter in the first place.

If you are concerned about future compatibility though, then maybe you could have your own naming convention that doesn't clash with Gym. Something like AdventureEnvPool-v0, etc. where it's clear that your versions aren't related to Gym.

Trinkle23897 commented 2 years ago

https://github.com/sail-sg/envpool/pull/15 it's here now