camontblanc closed this 5 years ago
That's because in the observations we represent the angle of each pole by its sine and cosine, so each pole contributes two entries rather than one. These are more convenient as inputs to neural networks than the angle itself because they are bounded between -1 and 1, and there are no discontinuities when the pole rotates through more than 360° in either direction.
Here are the relevant bits of code: https://github.com/deepmind/dm_control/blob/master/dm_control/suite/cartpole.py#L202-L207 https://github.com/deepmind/dm_control/blob/master/dm_control/suite/cartpole.py#L150-L153
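To make the count concrete, here is a minimal sketch of the encoding in plain Python. It is not the dm_control implementation: `encode_observation` is a hypothetical helper, and the ordering of the entries (cart position, then cosines, then sines, then velocities) is an assumption for illustration only. For two poles it yields 1 + 2 + 2 position entries plus 3 velocities, i.e. 8 observations from a 6-dimensional state.

```python
import math


def encode_observation(cart_position, pole_angles, velocities):
    """Hypothetical helper: replace each pole angle by its cosine and sine.

    The entry ordering here is an assumption, not taken from dm_control.
    """
    cosines = [math.cos(a) for a in pole_angles]
    sines = [math.sin(a) for a in pole_angles]
    position = [cart_position] + cosines + sines
    return position + list(velocities)


# State: cart position + 2 pole angles + 3 velocities = 6 values.
obs = encode_observation(0.0, [0.1, -0.2], [0.0, 0.0, 0.0])
print(len(obs))  # 8

# A pole that has spun one extra full turn gets the same encoding,
# so the network input has no jump at the wrap-around point.
wrapped = encode_observation(0.0, [0.1 + 2 * math.pi, -0.2], [0.0, 0.0, 0.0])
print(all(abs(a - b) < 1e-9 for a, b in zip(obs, wrapped)))  # True
```

The raw angle would grow without bound as the pole keeps spinning, while every sine and cosine entry stays in [-1, 1].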
Hi!
According to the tech report, the Cart-k-pole has:
two_pole
non-benchmarking task, and I imagine the state corresponds to the position and velocity of the cart and the two poles (so dim(S) is 6). However, why do we have 2 more observations? Thanks, by the way!