hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

Box observation_space high bound wrongly set by wrap_deepmind #829

Open alexpalms opened 4 years ago

alexpalms commented 4 years ago

Depending on user inputs, the wrap_deepmind function can apply ScaledFloatFrame first and then FrameStack. In that case, FrameStack sets the upper bound of the Box observation space to 255, while it should be 1.0.

A possible solution could be to call ScaledFloatFrame at the end, even if this means getting rid of LazyFrames.
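The ordering problem and the proposed reordering can be illustrated with a minimal sketch. It does not use the library: `Box`, `scaled_float_frame` and `frame_stack` below are hypothetical stand-ins that model only how each wrapper declares its observation space, under the assumption (taken from the report above) that FrameStack always declares an upper bound of 255 and ScaledFloatFrame declares 1.0. The 84x84x1 input shape and the stack of 4 frames are also assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for gym.spaces.Box: bounds, shape and dtype only."""
    low: float
    high: float
    shape: Tuple[int, int, int]
    dtype: str


def scaled_float_frame(space: Box) -> Box:
    # Stand-in for ScaledFloatFrame: observations are scaled to [0, 1],
    # so the declared space becomes a float32 Box with high=1.0.
    return Box(low=0.0, high=1.0, shape=space.shape, dtype="float32")


def frame_stack(space: Box, n_frames: int) -> Box:
    # Stand-in for FrameStack: frames are stacked along the channel axis and
    # the bounds are hard-coded to [0, 255], whatever the wrapped space says.
    height, width, channels = space.shape
    return Box(low=0, high=255, shape=(height, width, channels * n_frames),
               dtype=space.dtype)


# Assumed input: a single grayscale uint8 frame.
warped = Box(low=0, high=255, shape=(84, 84, 1), dtype="uint8")

# Order described in the report: scale first, then stack.
reported = frame_stack(scaled_float_frame(warped), n_frames=4)

# Proposed order: stack first, then scale.
proposed = scaled_float_frame(frame_stack(warped, n_frames=4))

print(reported.high, reported.dtype)  # 255 float32
print(proposed.high, proposed.dtype)  # 1.0 float32
```

In the reported order the declared space is float32 with an upper bound of 255, even though the observations themselves lie in [0, 1]. Applying the scaling last gives the expected bound, at the cost mentioned above: the scaled output is a plain float array rather than LazyFrames.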

araffin commented 4 years ago

Hello, that's true. In practice, this wrapper is not used, and the scaling is done during pre-processing. We may remove the wrapper, as it is now also present in gym.

alexpalms commented 4 years ago

Yep, I see that normalization is indeed performed in the observation_input method, but it is still an issue if someone uses the wrapper with the scale=True kwarg. Thanks for the feedback.