openai / spinningup

An educational resource to help anyone learn deep reinforcement learning.
https://spinningup.openai.com/
MIT License

Document Gym environment requirements by algorithm #143

Open nealmcb opened 5 years ago

nealmcb commented 5 years ago

Some or all of the algorithms seem to have requirements on which gym environments they work with.

Gym offers a rich variety of environments. Some information about them, now a few years old, is collected in https://github.com/openai/gym/issues/106 and on the "Table of environments" page of the openai/gym wiki.

As noted e.g. in #132 "Valid Gym environments to use",

FetchReach environment has Dict observation space (because it packages not only arm position, but also the target location into the observation), and spinning up does not implement support for Dict observation spaces yet.
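To make the Dict limitation concrete, here is a minimal sketch of how a dict observation could be flattened into the single real-valued vector the algorithms expect. The helper name `flatten_dict_observation` and the array sizes are hypothetical; the keys `observation`, `achieved_goal`, and `desired_goal` are the ones goal-based robotics environments such as FetchReach use. This is an illustration only, not something Spinning Up provides.

```python
import numpy as np

def flatten_dict_observation(obs, keys=("observation", "desired_goal")):
    """Concatenate the selected entries of a dict observation into one 1-D float32 array."""
    return np.concatenate(
        [np.asarray(obs[k], dtype=np.float32).ravel() for k in keys]
    )

# Stand-in for what a goal-based environment might return (sizes are made up).
example_obs = {
    "observation": np.zeros(10),
    "achieved_goal": np.zeros(3),
    "desired_goal": np.ones(3),
}

flat = flatten_dict_observation(example_obs)
print(flat.shape)
```

As far as I know, newer Gym releases also ship `gym.spaces.flatten` and `gym.wrappers.FlattenObservation` for this purpose, which would be preferable to a hand-rolled helper where available.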

In another random environment I found elsewhere on GitHub (Banana-v0), I got this error from ppo:

  File "/srv/s/aima/spinningup/spinningup/spinup/algos/ppo/ppo.py", line 256, in ppo
    a, v_t, logp_t = sess.run(get_action_ops, feed_dict={x_ph: o.reshape(1,-1)})
AttributeError: 'list' object has no attribute 'reshape'

because the environment returns its observations as a Python list, not as a NumPy array.
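One possible workaround for the error above is a thin wrapper that converts every observation to a NumPy array before the algorithm sees it. The sketch below assumes the older Gym API, where `step` returns a 4-tuple; `ListObsEnv` is a hypothetical stand-in for an environment like Banana-v0, and `ArrayObservation` is a name invented here.

```python
import numpy as np

class ListObsEnv:
    """Hypothetical stand-in for an environment that returns list observations."""
    def reset(self):
        return [0.0, 1.0, 2.0]

    def step(self, action):
        # (observation, reward, done, info), as in the older Gym API
        return [1.0, 2.0, 3.0], 0.0, False, {}

class ArrayObservation:
    """Wrap an environment so reset() and step() return float32 NumPy arrays."""
    def __init__(self, env):
        self.env = env

    def reset(self):
        return np.asarray(self.env.reset(), dtype=np.float32)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return np.asarray(obs, dtype=np.float32), reward, done, info

    def __getattr__(self, name):
        # Forward everything else (spaces, render, close, ...) to the wrapped env.
        return getattr(self.env, name)

env = ArrayObservation(ListObsEnv())
o = env.reset()
print(o.reshape(1, -1).shape)  # the call that failed on a plain list
```

With a real Gym environment the same idea could be written as a subclass of `gym.ObservationWrapper`, overriding its `observation` method.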

So it would help to document what each algorithm currently requires of an environment, along with a list of missing capabilities that could be easily integrated (such as the Dict observation support discussed in the FetchReach issue). Those would be low-hanging fruit for contributions.
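As a rough illustration of what such documentation could look like in machine-checkable form, the sketch below encodes a requirements table and checks a pair of space types against it. The table contents are an assumption based on my reading of the Spinning Up docs (Box observations for all algorithms; Box or Discrete actions for vpg, trpo, and ppo; Box actions only for ddpg, td3, and sac) and should be verified before use. The names `ALGO_REQUIREMENTS` and `compatibility_problems` are hypothetical.

```python
# ASSUMED requirements, not an official list; verify against the docs and code.
ALGO_REQUIREMENTS = {
    "vpg":  {"obs": {"Box"}, "act": {"Box", "Discrete"}},
    "trpo": {"obs": {"Box"}, "act": {"Box", "Discrete"}},
    "ppo":  {"obs": {"Box"}, "act": {"Box", "Discrete"}},
    "ddpg": {"obs": {"Box"}, "act": {"Box"}},
    "td3":  {"obs": {"Box"}, "act": {"Box"}},
    "sac":  {"obs": {"Box"}, "act": {"Box"}},
}

def compatibility_problems(algo, obs_space_type, act_space_type):
    """Return a list of human-readable mismatches; an empty list means no known problem."""
    req = ALGO_REQUIREMENTS[algo]
    problems = []
    if obs_space_type not in req["obs"]:
        problems.append(
            f"{algo}: observation space {obs_space_type} not in {sorted(req['obs'])}"
        )
    if act_space_type not in req["act"]:
        problems.append(
            f"{algo}: action space {act_space_type} not in {sorted(req['act'])}"
        )
    return problems

print(compatibility_problems("ppo", "Dict", "Box"))       # FetchReach-like case
print(compatibility_problems("ddpg", "Box", "Discrete"))
print(compatibility_problems("ppo", "Box", "Discrete"))
```

With a real environment, the type names could be taken from `type(env.observation_space).__name__` and `type(env.action_space).__name__`.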

nealmcb commented 5 years ago

Another useful example, with commentary on ppo requiring Box (real-valued) observation vectors, can be seen in #12, "Test script failed in FetchPush-v1 and HandManipulatePen-v0".

See also #122.