RobertTLange / gymnax

RL Environments in JAX 🌍
Apache License 2.0

BernoulliBandit observation space bounds are incorrect when time normalisation is enabled. #41

Closed jaronsgit closed 1 year ago

jaronsgit commented 1 year ago

With normalize_time: bool = True, the step count in the observation is normalised to lie between -1 and 1, but the observation space still declares bounds of 0 and params.max_steps_in_episode = 100. Normalised observations can therefore fall outside the declared space.
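To illustrate the mismatch, here is a minimal dependency-free sketch. It does not call gymnax: the helpers normalize_time and in_bounds are hypothetical, and the linear mapping of [0, max_steps_in_episode] onto [-1, 1] is an assumption based on the description above, not a copy of the library's code.

```python
# Hypothetical helpers for illustration only; not gymnax's implementation.

def normalize_time(t, t_max, min_lim=-1.0, max_lim=1.0):
    """Linearly map a step count t in [0, t_max] onto [min_lim, max_lim]."""
    return (max_lim - min_lim) * t / t_max + min_lim


def in_bounds(value, low, high):
    """Return True if value lies inside the closed interval [low, high]."""
    return low <= value <= high


max_steps_in_episode = 100

# Bounds as reported in this issue: 0 and max_steps_in_episode.
declared_low, declared_high = 0.0, float(max_steps_in_episode)

# Normalised time feature for every step of an episode.
normalized = [
    normalize_time(t, max_steps_in_episode)
    for t in range(max_steps_in_episode + 1)
]

# Values that the declared observation space would reject.
violations = [
    x for x in normalized if not in_bounds(x, declared_low, declared_high)
]

print(f"range of normalised time: [{min(normalized)}, {max(normalized)}]")
print(f"values outside declared bounds: {len(violations)}")
```

Under these assumptions, every step in the first half of the episode produces a negative time feature, which the declared lower bound of 0 excludes. Bounds of -1 and 1 for the time entry would contain all of the normalised values.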

RobertTLange commented 1 year ago

Thank you so much @jaronsgit -- it is merged and will be part of the next release. Cheers, Rob