Open · gunnxx opened this issue 2 years ago
We've changed a few things in the code release relative to the paper. The policy and the discriminator use slightly different observations. amp_obs holds the observations for the discriminator, which consist mainly of features in reduced coordinates. We also stack these observations over multiple timesteps, which is why amp_obs is 1400D. The observations for the policy use features in maximal coordinates, which makes them higher dimensional than the features described in the paper.
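To make the stacking concrete, here is a minimal sketch of how per-timestep discriminator features could be flattened into a single amp_obs vector. The split into 10 timesteps of 140 features each is purely an assumption chosen so the total matches the 1400D mentioned above; the actual step count and per-step size in the code release may differ, and `stack_amp_obs` is a hypothetical helper, not a function from the repository.

```python
import numpy as np

# Assumed values for illustration only: the thread states the total (1400D),
# not how it splits into timesteps and per-step features.
NUM_AMP_OBS_STEPS = 10       # hypothetical number of stacked timesteps
AMP_OBS_PER_STEP = 140       # hypothetical feature size per timestep


def stack_amp_obs(history: np.ndarray) -> np.ndarray:
    """Flatten a (num_envs, num_steps, feats_per_step) history buffer
    into (num_envs, num_steps * feats_per_step) discriminator inputs."""
    num_envs = history.shape[0]
    return history.reshape(num_envs, -1)


# A dummy history buffer for 4 parallel environments.
history = np.zeros((4, NUM_AMP_OBS_STEPS, AMP_OBS_PER_STEP), dtype=np.float32)
amp_obs = stack_amp_obs(history)
print(amp_obs.shape)  # (4, 1400)
```

The policy observation (253D in the question) would be built separately and is not stacked in this sketch.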
Hi, I would like to ask about the difference between states, observations, and amp_observations.
My understanding is that no state space is defined for the humanoid task and only the observation space is used. However, there is also amp_obs in the extras, and I can't tell where it is used. It is confusing because the ASE paper says "Combined, these features result in a 120D state space". Is that a feature (i.e. an observation) or a state? When I print the observation space of HumanoidAMPGetup, it reports 253D for the observation and 1400D for the amp_observation.