hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/

observation_space problem #1057

Closed asd3200asd closed 3 years ago

asd3200asd commented 3 years ago

foo Gym is my custom Gym environment. There is a problem with how the observation_space is set.

My program raises this error: ValueError: Cannot feed value of shape (1,) for Tensor 'deepq/input/Ob:0', which has shape '(?, 5)'

I think my observation_space is set up in the same format as CartPole-v1's.

================================== Here is CartPole-v1's observation_space

    # Angle at which to fail the episode
    self.theta_threshold_radians = 12 * 2 * math.pi / 360
    self.x_threshold = 2.4

    # Angle limit set to 2 * theta_threshold_radians so failing observation
    # is still within bounds.
    high = np.array([self.x_threshold * 2,
                     np.finfo(np.float32).max,
                     self.theta_threshold_radians * 2,
                     np.finfo(np.float32).max],
                    dtype=np.float32)

    self.action_space = spaces.Discrete(2)
    self.observation_space = spaces.Box(-high, high, dtype=np.float32)

================================== Here is my environment's observation_space

    self.action_space = spaces.Discrete(10)
    low = np.array([0, -90, -180, 0, 0], dtype=np.float32)
    high = np.array([300, 90, 180, 5, 5], dtype=np.float32)

    self.observation_space = spaces.Box(low=low, high=high, dtype=np.float32)

================================== Here is my main program (the stable_baselines DQN example, only changed to use foo Gym)

   import gym
   import gym_foo
   from stable_baselines.common.vec_env import DummyVecEnv
   from stable_baselines.deepq.policies import MlpPolicy
   from stable_baselines import DQN, PPO2, A2C, ACKTR

   env = gym.make('foo-v0')
   print("Observation space:", env.observation_space)
   print("Shape:", env.observation_space.shape)

   model = DQN(MlpPolicy, env, verbose=1)
   model.learn(total_timesteps=250)
   model.save("deepq_cartpole")

   del model # remove to demonstrate saving and loading
   model = DQN.load("deepq_cartpole")
   obs = env.reset()
   while True:
       action, _states = model.predict(obs)
       obs, rewards, dones, info = env.step(action)
       env.render()
Miffyli commented 3 years ago

Check the docs on custom environments, especially the use of the check_env utility, to make sure that your environment works as expected.
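The check itself is only a couple of lines; a minimal sketch, assuming your package exposes a FooEnv class (adjust the import to wherever your environment class actually lives):

    from stable_baselines.common.env_checker import check_env
    from gym_foo.envs.foo_env import FooEnv  # assumed location of your custom env class

    env = FooEnv()
    # Checks the observation/action spaces and the return values of reset()/step(),
    # and prints warnings for anything that does not follow the Gym API
    check_env(env, warn=True)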

asd3200asd commented 3 years ago

The following code does not run for me:

    from stable_baselines.common.env_checker import check_env

    env = FooEnv()
    # It will check your custom environment and output additional warnings if needed
    check_env(env)

Is the stable-baselines environment API different from the Gym environment API?

Miffyli commented 3 years ago

No, stable-baselines uses the same API as presented in OpenAI Gym. check_env attempts to check that the environment follows the API. If you are getting an error with that code, it likely means there is something off in your environment (check the error message).
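
For reference, here is a minimal sketch of what an environment that follows the API looks like, reusing the spaces you posted above (the FooEnv class name and the bodies of reset()/step() are placeholders, not your actual code):

    import numpy as np
    import gym
    from gym import spaces

    class FooEnv(gym.Env):
        """Minimal sketch of a custom environment following the Gym API."""

        def __init__(self):
            super(FooEnv, self).__init__()
            self.action_space = spaces.Discrete(10)
            low = np.array([0, -90, -180, 0, 0], dtype=np.float32)
            high = np.array([300, 90, 180, 5, 5], dtype=np.float32)
            self.observation_space = spaces.Box(low=low, high=high, dtype=np.float32)
            self.state = np.zeros(5, dtype=np.float32)

        def reset(self):
            self.state = np.zeros(5, dtype=np.float32)
            # reset() must return one observation matching observation_space,
            # i.e. an array of shape (5,), not a scalar or a length-1 list
            return self.state

        def step(self, action):
            # Placeholder dynamics; a real environment would update self.state here
            reward = 0.0
            done = False
            info = {}
            # step() must return (observation, reward, done, info), with the
            # observation again matching the (5,) shape of observation_space
            return self.state, reward, done, info

The error in your first post ("Cannot feed value of shape (1,)") suggests your reset() or step() is returning a single value instead of a 5-element array; check_env should flag the same mismatch.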