Closed peiseng closed 4 years ago
Hello, please fill in the issue template completely.
You forgot to include the complete traceback. The error is caused by the eval env not being properly wrapped. See the warning:
stable_baselines/common/callbacks.py:277: UserWarning: Training and eval env are not of the same type<stable_baselines.her.utils.HERGoalEnvWrapper object at 0x7f68c48c09e8> != <stable_baselines.common.vec_env.dummy_vec_env.DummyVecEnv object at 0x7f68b00f7908>
"{} != {}".format(self.training_env, self.eval_env))
I would appreciate a PR that solves this issue.
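To make the diagnosis concrete, here is a minimal sketch of the kind of type check that produces the warning above. The classes `TrainWrapper` and `EvalVecEnv` are hypothetical stand-ins, not the library's classes, and `same_env_type` is an illustrative helper rather than the actual code in callbacks.py; only the warning text is taken from the log.

```python
import warnings


class TrainWrapper:
    """Hypothetical stand-in for the wrapper around the training env."""

    def __init__(self, env):
        self.env = env


class EvalVecEnv:
    """Hypothetical stand-in for the vectorized eval env."""

    def __init__(self, env):
        self.env = env


def same_env_type(training_env, eval_env):
    """Return True if both envs share the same outermost type, else warn."""
    if type(training_env) is type(eval_env):
        return True
    warnings.warn("Training and eval env are not of the same type"
                  "{} != {}".format(training_env, eval_env))
    return False


# Same base env, different outer wrappers: the check fails and warns.
print(same_env_type(TrainWrapper("env"), EvalVecEnv("env")))
# Same outer wrapper on both sides: the check passes silently.
print(same_env_type(TrainWrapper("env"), TrainWrapper("env")))
```

The point of the sketch is that the comparison only looks at the outermost wrapper, so the eval env would need to be wrapped the same way as the training env for the check to pass.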
I'm having the same issue with the parking-v0 environment.
python train.py --algo her --env parking-v0 -n 10000
It returns the same error:
../stable-baselines/stable_baselines/common/callbacks.py:280: UserWarning: Training and eval env are not of the same type<stable_baselines.her.utils.HERGoalEnvWrapper object at 0x7f68828e7be0> != <stable_baselines.common.vec_env.dummy_vec_env.DummyVecEnv object at 0x7f686266ea58>
"{} != {}".format(self.training_env, self.eval_env))
Traceback (most recent call last):
  File "train.py", line 411, in <module>
    model.learn(n_timesteps, **kwargs)
  File "../stable-baselines/stable_baselines/her/her.py", line 113, in learn
    replay_wrapper=self.replay_wrapper)
  File "../stable-baselines/stable_baselines/sac/sac.py", line 416, in learn
    if callback.on_step() is False:
  File "../stable-baselines/stable_baselines/common/callbacks.py", line 90, in on_step
    return self._on_step()
  File "../stable-baselines/stable_baselines/common/callbacks.py", line 166, in _on_step
    continue_training = callback.on_step() and continue_training
  File "../stable-baselines/stable_baselines/common/callbacks.py", line 90, in on_step
    return self._on_step()
  File "../stable-baselines/stable_baselines/common/callbacks.py", line 298, in _on_step
    return_episode_rewards=True)
  File "../stable-baselines/stable_baselines/common/evaluation.py", line 38, in evaluate_policy
    action, state = model.predict(obs, state=state, deterministic=deterministic)
  File "../stable-baselines/stable_baselines/sac/sac.py", line 527, in predict
    vectorized_env = self._is_vectorized_observation(observation, self.observation_space)
  File "../stable-baselines/stable_baselines/common/base_class.py", line 723, in _is_vectorized_observation
    .format(", ".join(map(str, observation_space.shape))))
ValueError: Error: Unexpected observation shape () for Box environment, please use (18,) or (n_env, 18) for the observation shape.
Wrapping the eval_env with HERGoalEnvWrapper in callbacks.py gets rid of the warning:
import stable_baselines
...
self.eval_env = stable_baselines.her.HERGoalEnvWrapper(self.eval_env)
But then all sorts of issues with the observation dimensions come up, so I'm not sure this is the right way to go about it. For example, after this modification it becomes necessary to flatten the observation (a list of lists) in ddpg.py (or in whichever other algorithm is used with HER):
observation = np.array([item for sublist in observation for item in sublist])
I also need to flatten obs_dict in utils.py:
if len(obs_dict[KEY_ORDER[0]].shape) == 2:
    for key in KEY_ORDER:
        obs_dict[key] = obs_dict[key][0]
I haven't solved this; I'm just sharing what I tried in case it is of any help.
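For readers following along, the two workarounds above (dropping a leading batch dimension and flattening the dict observation) can be sketched in plain Python. This is only an illustration under stated assumptions: the contents of `KEY_ORDER` are assumed, since the thread only references it by name; `unbatch` and `flatten_goal_obs` are hypothetical helpers, and the library itself works on NumPy arrays rather than lists. The ValueError in the traceback asks for shape (18,) or (n_env, 18), which suggests the model expects one flat vector per observation.

```python
# Assumed key order; the thread only references KEY_ORDER by name.
KEY_ORDER = ["observation", "achieved_goal", "desired_goal"]


def unbatch(value):
    """Drop a leading batch dimension of size 1, e.g. [[1, 2]] -> [1, 2]."""
    if len(value) == 1 and isinstance(value[0], (list, tuple)):
        return list(value[0])
    return list(value)


def flatten_goal_obs(obs_dict):
    """Concatenate the dict entries into one flat list, following KEY_ORDER."""
    flat = []
    for key in KEY_ORDER:
        flat.extend(unbatch(obs_dict[key]))
    return flat


# A single dict observation and the same observation with a batch dimension.
single = {"observation": [0.1, 0.2], "achieved_goal": [0.3], "desired_goal": [0.4]}
batched = {key: [value] for key, value in single.items()}

print(flatten_goal_obs(single))
print(flatten_goal_obs(batched))
```

Both calls produce the same flat vector, which is the property the patches to ddpg.py and utils.py are trying to restore by hand.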
"For example after this modification, it's now necessary to flatten the observation"
There is normally a method for that in the HERGoalEnvWrapper. I don't have time this week to take a deeper look at it. For now you can deactivate the evaluation by passing --eval-freq -1, unless you solve the issue, in which case I would appreciate a PR ;)
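As a rough illustration of how a negative `--eval-freq` could switch evaluation off, here is a minimal sketch. It is not the actual train.py: the default of 10000, the `make_callbacks` helper, and the placeholder string standing in for an evaluation callback are all assumptions made for the example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # The default value is an assumption; negative values mean "no evaluation".
    parser.add_argument("--eval-freq", type=int, default=10000,
                        help="evaluate every n steps (negative: no evaluation)")
    return parser


def make_callbacks(eval_freq):
    """Return the callbacks to attach; a string stands in for a real callback."""
    callbacks = []
    if eval_freq > 0:
        callbacks.append("eval_callback")
    return callbacks


args = build_parser().parse_args(["--eval-freq", "-1"])
print(make_callbacks(args.eval_freq))  # prints []
```

With no evaluation callback attached, the code path that calls `evaluate_policy` on the unwrapped eval env is never reached, which is why the flag works as a stopgap.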
OK, I'll give it a try. But I will need to create a PR for Stable Baselines as well, since I'm changing HERGoalEnvWrapper.
Hi, I am getting this error while running this code:
It happens when time-steps=10000.
May I know how to solve it?