eliork closed this issue 3 years ago
Why should it change? With VecEnv, the reset is automatic (cf. the doc).
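As a rough illustration of what "the reset is automatic" means for a rollout loop, here is a minimal sketch. `CountdownEnv` and `AutoResetVecEnv` are hypothetical stand-ins written for this example, not the Stable Baselines classes. The sketch only mimics the documented behaviour that a vectorized env resets a finished sub-environment by itself and returns the first observation of the next episode.

```python
class CountdownEnv:
    """Hypothetical toy env: the observation is a step counter."""

    def __init__(self, episode_length=3):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 0.0, done, {}


class AutoResetVecEnv:
    """Hypothetical minimal vectorized wrapper with automatic reset."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d, info = env.step(action)
            if d:
                # Episode finished: reset right away and return the first
                # observation of the next episode instead of the terminal one.
                o = env.reset()
            obs.append(o)
            rewards.append(r)
            dones.append(d)
            infos.append(info)
        return obs, rewards, dones, infos


venv = AutoResetVecEnv([CountdownEnv(episode_length=3)])
trace = [venv.reset()[0]]
done_flags = []
for _ in range(4):
    obs, _, dones, _ = venv.step([0])
    trace.append(obs[0])
    done_flags.append(dones[0])
```

The caller never calls `reset()` inside the loop. The `done` flag is still reported for the step that ended the episode, but the observation returned with it already belongs to the new episode.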
Shouldn't `self.obs` hold the current observation? Isn't that the expected output of the `self.env.step(clipped_actions)` function?
> shouldn't `self.obs` hold the current observation?

`self.obs` holds the new observation after stepping in the env.
I found the problem: I had configured the observation space incorrectly. I defined it as
self.observation_space = spaces.Box(low=np.finfo(np.float32).min,
                                    high=np.finfo(np.float32).max,
                                    shape=(1, self.z_size + self.n_commands * self.n_command_history),
                                    dtype=np.uint8)
instead of using dtype=np.float32.
Thank you for your help, closing this issue.
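
One plausible reason the wrong dtype makes the observation look frozen is sketched below. It assumes that the runner writes each new observation in place into a buffer allocated with the observation space's dtype. That assumption has not been checked against the linked file. The `collect` helper and the sample values are made up for illustration. With a `uint8` buffer, NumPy truncates small float observations to 0 on assignment, so successive observations look identical.

```python
import numpy as np


def collect(buffer_dtype, observations):
    """Write each observation into a preallocated buffer, as a rollout loop might."""
    obs_buffer = np.zeros((1, 4), dtype=buffer_dtype)
    snapshots = []
    for new_obs in observations:
        # In-place assignment casts the values to the buffer's dtype.
        obs_buffer[:] = new_obs
        snapshots.append(obs_buffer.copy())
    return snapshots


# Two different float observations with values in [0, 1) (made-up numbers).
observations = [
    np.array([[0.1, 0.5, 0.9, 0.3]], dtype=np.float32),
    np.array([[0.2, 0.7, 0.4, 0.8]], dtype=np.float32),
]

uint8_snaps = collect(np.uint8, observations)
float_snaps = collect(np.float32, observations)

# Did the stored observation change between the two steps?
uint8_changed = not np.array_equal(uint8_snaps[0], uint8_snaps[1])
float_changed = not np.array_equal(float_snaps[0], float_snaps[1])
print("uint8 buffer changed:", uint8_changed)
print("float32 buffer changed:", float_changed)
```

Under these assumptions the `uint8` buffer stays at all zeros across steps, while the `float32` buffer tracks the new values. That matches the symptom described in this thread.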
https://github.com/araffin/learning-to-drive-in-5-minutes/blob/ccb27e66d593d6036fc1076dcec80f74a3f5e239/algos/custom_ppo2.py#L157
Hi, I am trying to recreate your approach in a custom environment I have built. I am trying to use your custom version of PPO2, but I noticed that after this line self.observation doesn't change. Is that expected behaviour?
Thanks, appreciate your work