openai / gym

A toolkit for developing and comparing reinforcement learning algorithms.
https://www.gymlibrary.dev

Is there a comprehensive description of every env besides the wiki? #1699

Closed JiyueWang closed 4 years ago

JiyueWang commented 5 years ago

I am trying to understand the Atari game envs, but all I can do is print or plot some samples of states, rewards, and actions. The information in the wiki is not enough. Is there a more straightforward way?

christopherhesse commented 4 years ago

You can save observations, rewards, and actions and plot them. I'm not sure what the question is here; could you rephrase it?

JiyueWang commented 4 years ago

Thank you for your reply. What I expect is a description like the one for the CartPole env at https://github.com/openai/gym/wiki/CartPole-v0, but for every game. On the other hand, there are many versions of a single game such as Pong:

Pong-ram-v0, Pong-ram-v4, Pong-ramDeterministic-v0, Pong-ramDeterministic-v4, Pong-ramNoFrameskip-v0, Pong-ramNoFrameskip-v4, Pong-v0, Pong-v4, PongDeterministic-v0, PongDeterministic-v4, PongNoFrameskip-v0, PongNoFrameskip-v4

I don't know how to deal with them.
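The IDs above follow a consistent naming pattern, which makes them easier to deal with programmatically. Here is a small sketch that parses the pattern (base game + optional `-ram` + optional `Deterministic`/`NoFrameskip` suffix + version tag); the pattern is inferred from the names listed in this thread, not taken from any official spec.

```python
# Classify Atari env IDs by their name components.
# Pattern inferred from the IDs in this thread: game, optional "-ram",
# optional "Deterministic"/"NoFrameskip", then "-vN".
import re

PATTERN = re.compile(
    r"^(?P<game>[A-Za-z]+?)"
    r"(?P<ram>-ram)?"
    r"(?P<mode>Deterministic|NoFrameskip)?"
    r"-v(?P<version>\d+)$"
)

def parse_env_id(env_id):
    """Split an Atari env ID into game, observation type, frameskip mode, version."""
    m = PATTERN.match(env_id)
    if m is None:
        raise ValueError(f"unrecognized env id: {env_id}")
    return {
        "game": m.group("game"),
        "obs": "ram" if m.group("ram") else "image",
        # no suffix = the default variant (random frameskip, per the reply below)
        "mode": m.group("mode") or "default",
        "version": int(m.group("version")),
    }

for env_id in ["Pong-v0", "Pong-ramDeterministic-v4", "PongNoFrameskip-v4"]:
    print(env_id, parse_env_id(env_id))
```

With a live gym install you could feed this every ID in the registry to get a table of all variants per game.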

christopherhesse commented 4 years ago

You can ignore the v0 versions of those. "ram" indicates you get RAM observations instead of images. Beyond that, there's "Deterministic", "NoFrameskip", and "" (no suffix). It would be nice if those variants were documented on the wiki.

It looks like NoFrameskip is the least altered variant (you get every frame): https://github.com/openai/gym/blob/c33cfd8b2cc8cac6c346bc2182cd568ef33b8821/gym/envs/__init__.py#L654

"Deterministic" looks like you get frameskip (this is easy to do yourself so no need to use that mode)

"" looks like you get a random frameskip between 2 and 5.

Everyone should likely use the "noframeskip" version and use their own frameskip wrapper if they want it.
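A frameskip wrapper of the kind suggested here can be sketched in a few lines: repeat each chosen action `skip` times and sum the rewards. This is written against a generic `reset()`/`step()` interface so it runs standalone; a real version would subclass `gym.Wrapper` and wrap e.g. `gym.make("PongNoFrameskip-v4")`. The `CountingEnv` below is a toy stand-in env for illustration only.

```python
class FrameSkip:
    """Repeat each action `skip` times, summing rewards (minimal sketch).

    A real implementation would subclass gym.Wrapper; anything exposing
    reset() and step(action) -> (obs, reward, done, info) works here.
    """

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        obs, done, info = None, False, {}
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break  # stop skipping once the episode ends
        return obs, total_reward, done, info


class CountingEnv:
    """Toy stand-in: obs is a step counter, reward 1 per step, 10-step episodes."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 10, {}


env = FrameSkip(CountingEnv(), skip=4)
env.reset()
obs, reward, done, info = env.step(0)
print(obs, reward, done)  # 4 4.0 False
```

For actual Atari images, common wrappers also take a pixel-wise max over the last two frames to handle sprite flicker; that detail is omitted here to keep the sketch minimal.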

Please feel free to update the wiki with this information.

JiyueWang commented 4 years ago

Thanks for your reply. I finally found some information about the variants at the end of this link, and it recommends Chapters 2 and 5 of this paper: Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents.