hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

What is the role of the lower and higher bound in the Box for the observation space? #1104

Closed outdoteth closed 3 years ago

outdoteth commented 3 years ago

For example, I have created an environment with the following observation space:

self.observation_space = spaces.Box(
    low=-np.inf,  # Talking about this line
    high=np.inf,  # Talking about this line
    shape=(CHANNELS, days_lookback * CANDLE_TIMEFRAME),
    dtype=np.float32,  # float dtype: uint8 cannot represent -inf/inf or normalized values
)

Is it OK to set low and high to -np.inf and np.inf? If my actual values lie between 0 and 1, will it make a difference if I use -np.inf and np.inf as the bounds?

araffin commented 3 years ago

Hello,

> Is it OK to set low and high to -np.inf and np.inf? If my actual values lie between 0 and 1, will it make a difference if I use -np.inf and np.inf as the bounds?

Yes, it is OK. The bounds are only taken into account when sampling from the space or when the space is an action space. So it won't make any difference if you specify -inf and inf as bounds for the observation space. However, you should still normalize the values if possible.
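To make the point about bounds concrete, here is a minimal sketch using gym's `spaces.Box`. The shape `(3, 4)` and the observation values are made up for illustration and stand in for the `(CHANNELS, days_lookback * CANDLE_TIMEFRAME)` shape from the question. It only shows where the bounds are visible at the space level (membership checks and sampling); it does not show anything about how the training code itself reads observations.

```python
import numpy as np
from gym import spaces

# Hypothetical shape, standing in for (CHANNELS, days_lookback * CANDLE_TIMEFRAME).
SHAPE = (3, 4)

# Same shape and dtype, different bounds.
unbounded = spaces.Box(low=-np.inf, high=np.inf, shape=SHAPE, dtype=np.float32)
unit_box = spaces.Box(low=0.0, high=1.0, shape=SHAPE, dtype=np.float32)

# A made-up observation with values outside [0, 1], e.g. z-scored features.
obs = np.full(SHAPE, 2.5, dtype=np.float32)

# Membership checks depend on the bounds.
in_unbounded = unbounded.contains(obs)
in_unit_box = unit_box.contains(obs)

# Sampling is the other place the bounds matter: samples stay inside them.
sample = unit_box.sample()

print(in_unbounded, in_unit_box, sample.min(), sample.max())
```

If you ever validate observations against the space (for example with `contains`), the tighter bounds would reject out-of-range values, so loose bounds are the safer choice when some features are not confined to [0, 1].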

outdoteth commented 3 years ago

Cool, thanks. Yes, for some of the values in my observation I'm using z-score standardization to normalize, but for others I'm using min-max normalization, which is why I asked the question. The z-score normalized values will be between -np.inf and np.inf, but the min-max values will be between 0 and 1.
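For reference, the two normalization schemes mentioned here could look roughly like the following NumPy sketch. The helper names `z_score` and `min_max`, the `eps` guard, and the sample price and volume arrays are all assumptions made for illustration, not code from the original environment.

```python
import numpy as np


def z_score(x, eps=1e-8):
    """Standardize to zero mean and unit variance; the result is unbounded in principle."""
    return (x - x.mean()) / (x.std() + eps)


def min_max(x, eps=1e-8):
    """Rescale linearly so the values fall in [0, 1]."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo + eps)


# Made-up feature rows (one channel each).
prices = np.array([10.0, 12.0, 11.0, 15.0, 14.0], dtype=np.float32)
volumes = np.array([100.0, 250.0, 175.0, 500.0, 325.0], dtype=np.float32)

# Stack into a (channels, time) observation, matching a float32 Box.
obs = np.stack([z_score(prices), min_max(volumes)]).astype(np.float32)

print(obs.shape)
print(obs[0].min(), obs[0].max())  # z-scored row: can fall outside [0, 1]
print(obs[1].min(), obs[1].max())  # min-max row: stays within [0, 1]
```

Because the z-scored channel has no fixed range, a single Box covering both channels needs -np.inf and np.inf bounds (or per-element `low`/`high` arrays if you want to record the tighter range of the min-max channel).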