Closed: aaronsnoswell closed this issue 4 years ago.
I'm using RLlib grid search with TF+Keras and I get a similar problem with the same root cause: I can't use a Tuple containing a Discrete in my observation space.
Hello! Sorry for the long delay.
Indeed: there is no support for discrete observation spaces, and there is no plan to implement it. A simple thing you could do is wrap your environment with something that converts the integer observation into a one-hot vector. Then things should work fine.
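For concreteness, here is a minimal sketch of that wrapper idea using a standard `gym.ObservationWrapper`; the class name and details are illustrative, not part of Spinning Up or this issue:

```python
import gym
import numpy as np
from gym import spaces


class OneHotObsWrapper(gym.ObservationWrapper):
    """Wrap an env whose observation_space is Discrete(n) and emit one-hot vectors."""

    def __init__(self, env):
        super().__init__(env)
        assert isinstance(env.observation_space, spaces.Discrete)
        self.n = env.observation_space.n
        # Downstream code now sees a Box observation space of shape (n,).
        self.observation_space = spaces.Box(low=0.0, high=1.0,
                                            shape=(self.n,), dtype=np.float32)

    def observation(self, obs):
        # Convert the integer observation into a one-hot vector.
        one_hot = np.zeros(self.n, dtype=np.float32)
        one_hot[obs] = 1.0
        return one_hot


# Example usage:
# env = OneHotObsWrapper(gym.make("FrozenLake-v0"))
```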
Summary: I've noticed that the Spinning Up algorithm implementations don't seem to support discrete observation spaces defined with `gym.spaces.Discrete`.

Steps to reproduce:
python -m spinup.run ppo --env FrozenLake-v0
Alternatively, try any other algorithm, and/or any other gym environment that uses `gym.spaces.Discrete` for its observation space (e.g. any of the Algorithmic or Toy Text families).

Expected result: Should run and train a PPO policy for the FrozenLake task.
Observed result: The error seems to be a problem with a `tf.placeholder` when constructing the policy network.
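A hypothetical snippet (not from the original report) illustrating the likely root cause: a Discrete observation is a single integer, so the space does not expose an (obs_dim,) vector shape from which to build a float placeholder for the policy network's input:

```python
import gym

env = gym.make("FrozenLake-v0")
print(env.observation_space)         # Discrete(16): each observation is one integer in [0, 16)
print(env.observation_space.shape)   # not an (obs_dim,) vector shape usable for a Box-style placeholder
```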
Notes: Interestingly, it seems that discrete action spaces are fine (e.g. I can train policies for the MountainCar task).
My understanding is that policy gradient methods in general should support discrete observation spaces.