Document Gym environment requirements by algorithm

Some or all of the algorithms appear to place requirements on the Gym environments they can run on, but those requirements are not documented.

Gym offers a rich variety of environments, but the available information on some of them is a few years old. See https://github.com/openai/gym/issues/106 and the "Table of environments" page of the openai/gym wiki.

As noted e.g. in #132 "Valid Gym environments to use",

FetchReach environment has Dict observation space (because it packages not only arm position, but also the target location into the observation), and spinning up does not implement support for Dict observation spaces yet.
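To illustrate the kind of adaptation that quote implies, here is a minimal sketch of flattening a dict observation into the single vector the algorithms expect. The helper name `flatten_dict_observation` is hypothetical, the array sizes are only illustrative, and the key names assume the goal-based layout (`observation`, `achieved_goal`, `desired_goal`). Newer Gym releases appear to ship their own utilities for this (for example `gym.wrappers.FlattenObservation`), which would likely be preferable where available.

```python
import numpy as np


def flatten_dict_observation(obs, keys=("observation", "desired_goal")):
    """Concatenate the selected entries of a dict observation into one 1-D array.

    Hypothetical helper, not part of Spinning Up or Gym. The default keys are
    an assumption about the goal-based observation layout.
    """
    parts = [np.asarray(obs[k], dtype=np.float32).ravel() for k in keys]
    return np.concatenate(parts)


# Stand-in observation with illustrative sizes (not read from a real environment).
example = {
    "observation": np.zeros(10),
    "achieved_goal": np.zeros(3),
    "desired_goal": np.ones(3),
}
flat = flatten_dict_observation(example)
print(flat.shape)
```

The key order is fixed by the `keys` argument so that the flattened vector has a stable layout from step to step.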

In another environment I found elsewhere on GitHub (Banana-v0), PPO failed with this error:

  File "/srv/s/aima/spinningup/spinningup/spinup/algos/ppo/ppo.py", line 256, in ppo
    a, v_t, logp_t = sess.run(get_action_ops, feed_dict={x_ph: o.reshape(1,-1)})         
AttributeError: 'list' object has no attribute 'reshape'

because the environment returns each observation as a Python list rather than a NumPy array.
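A possible workaround on the user side, sketched under assumptions: wrap the environment so that observations are converted with `np.asarray` before the algorithm sees them. The class names `ArrayObservation` and `ListEnv` are hypothetical, and the sketch assumes the classic Gym API in which `reset()` returns an observation and `step()` returns a 4-tuple.

```python
import numpy as np


class ArrayObservation:
    """Hypothetical adapter that returns observations as NumPy arrays.

    Assumes the classic Gym API: reset() -> obs, step() -> (obs, reward, done, info).
    """

    def __init__(self, env):
        self.env = env

    def reset(self):
        return np.asarray(self.env.reset(), dtype=np.float32)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return np.asarray(obs, dtype=np.float32), reward, done, info

    def __getattr__(self, name):
        # Only called for attributes not found on the adapter itself.
        if name == "env":
            raise AttributeError(name)
        # Delegate observation_space, action_space, etc. to the wrapped env.
        return getattr(self.env, name)


class ListEnv:
    """Stand-in environment that returns list observations, for demonstration only."""

    def reset(self):
        return [0.0, 1.0, 2.0]

    def step(self, action):
        return [1.0, 2.0, 3.0], 0.0, False, {}


env = ArrayObservation(ListEnv())
o = env.reset()
print(o.reshape(1, -1).shape)  # the reshape call from the traceback now succeeds
```

With an adapter like this, the `o.reshape(1,-1)` call in `ppo.py` would receive an array, though whether the rest of the environment is compatible is a separate question.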

So it would help to document what each algorithm currently requires of an environment, and to list the missing capabilities that could be integrated easily (such as the Dict observation support discussed in the FetchReach issue) as low-hanging fruit for contributors.
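As a rough idea of what such documentation could look like in checkable form, here is a sketch of a per-algorithm compatibility check. The requirements table is my assumption from reading the docs (Box observations everywhere, Box or Discrete actions for the on-policy algorithms, Box actions only for the off-policy ones) and would need confirming by the maintainers. The function `check_env` is hypothetical, and it compares space class names rather than importing Gym, so the stand-in objects at the bottom are for demonstration only.

```python
from types import SimpleNamespace

# Assumed requirements, to be confirmed by the maintainers; not an authoritative list.
ASSUMED_REQUIREMENTS = {
    "vpg": {"obs": {"Box"}, "act": {"Box", "Discrete"}},
    "trpo": {"obs": {"Box"}, "act": {"Box", "Discrete"}},
    "ppo": {"obs": {"Box"}, "act": {"Box", "Discrete"}},
    "ddpg": {"obs": {"Box"}, "act": {"Box"}},
    "td3": {"obs": {"Box"}, "act": {"Box"}},
    "sac": {"obs": {"Box"}, "act": {"Box"}},
}


def check_env(env, algo):
    """Return a list of mismatch messages; an empty list means none were found."""
    req = ASSUMED_REQUIREMENTS[algo]
    problems = []
    obs_kind = type(env.observation_space).__name__
    act_kind = type(env.action_space).__name__
    if obs_kind not in req["obs"]:
        problems.append(
            f"{algo}: observation space {obs_kind} not in {sorted(req['obs'])}"
        )
    if act_kind not in req["act"]:
        problems.append(
            f"{algo}: action space {act_kind} not in {sorted(req['act'])}"
        )
    return problems


# Stand-in space objects whose class names mimic Gym's; not real Gym spaces.
Box = type("Box", (), {})
Dict = type("Dict", (), {})
fetch_like = SimpleNamespace(observation_space=Dict(), action_space=Box())
for message in check_env(fetch_like, "ppo"):
    print(message)
```

A check like this only covers space types. It would not catch the list-versus-array problem above, which would need a separate check on what `reset()` and `step()` actually return.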

Document Gym environment requirements by algorithm #143