openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
MIT License

Why Is Input Shape (111,) for Ant-V2? #1124

Open ryanmaxwell96 opened 4 years ago

ryanmaxwell96 commented 4 years ago

I don't understand why the input shape is (111,). I expected it to be either 27 or 29, but in models.py, inside network_fn in def mlp, I see input_shape = 111.

szczekulskij commented 4 years ago

I think you might get 27 variables passed, but some of them are single numbers (scalars) while others are vectors (described using lists). So when you count the scalars plus every element of the vectors, you end up with a number much bigger than 27.
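For Ant-v2 specifically, the count can be sketched as below. The component sizes are assumptions based on how gym's Ant-v2 observation appears to be assembled (qpos without the root x and y, then qvel, then clipped contact forces); they are not stated in this thread, and the arrays are zero placeholders rather than real simulator output.

```python
import numpy as np

# Placeholder arrays with the sizes assumed for Ant-v2 (hypothetical stand-ins
# for the simulator's data, filled with zeros just to count elements).
qpos = np.zeros(15)            # positions: 7 for the free root joint + 8 hinge joints
qvel = np.zeros(14)            # velocities: 6 for the root + 8 hinge joints
cfrc_ext = np.zeros((14, 6))   # external contact forces: one 6-vector per body


def ant_observation(qpos, qvel, cfrc_ext):
    """Flatten and concatenate the pieces into the vector an MLP would receive."""
    return np.concatenate([
        qpos[2:],                           # root x, y are hidden: 15 - 2 = 13
        qvel,                               # 14
        np.clip(cfrc_ext, -1, 1).ravel(),   # (14, 6) flattened to 84
    ])


obs = ant_observation(qpos, qvel, cfrc_ext)
print(obs.shape)
```

Under these assumptions the 27 values you expected (13 + 14) are still there; the remaining 84 entries come from the flattened contact-force matrix.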

Have a look at the Humanoid example: https://github.com/openai/gym/wiki/Humanoid-V1 Observations 1 through 21 are single, straightforward values, while the following ones are vectors (lists).

"The OpenAI gym environment hides the first 2 dimensions of qpos returned by MuJoCo. They correspond to the x and y coordinates of the robot root (abdomen). The reason is that these quantities can grow boundlessly and their absolute value does not carry any significance. If you want to obtain the full state, just use env.env.state_vector() and you will get a 47-dimensional vector containing qpos and qvel. Refer to mujoco_env.py. All the joint angles are in radians."

while the observation space is listed as "Type: Box(376)".
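As a rough check of how 47 and 376 can both be right for Humanoid, here is the arithmetic as a small sketch. The per-component sizes are my assumptions about how gym's Humanoid observation is put together (14 bodies, 24 position and 23 velocity coordinates); only the totals 47 and 376 come from the wiki page quoted above.

```python
# Assumed sizes of each piece of the Humanoid observation (hypothetical
# breakdown; only the totals are taken from the wiki page).
humanoid_components = {
    "qpos[2:]": 24 - 2,        # positions with root x, y hidden
    "qvel": 23,                # velocities
    "cinert": 14 * 10,         # per-body inertia terms, flattened
    "cvel": 14 * 6,            # per-body velocities, flattened
    "qfrc_actuator": 23,       # actuator forces
    "cfrc_ext": 14 * 6,        # per-body external contact forces, flattened
}

observation_dim = sum(humanoid_components.values())
state_vector_dim = 24 + 23     # full qpos + qvel, as from env.env.state_vector()

print(observation_dim, state_vector_dim)
```

So the state vector only covers qpos and qvel, while the Box observation also includes the flattened per-body vectors.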