Status: Closed (yeshenpy closed 3 years ago)
Hi, many thanks for your question. Entries 0-2 correspond to the global x, y, z position, and the next 2 entries encode the global planar velocity.
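To make the layout described above concrete, here is a minimal sketch of splitting those five entries. It assumes the five entries have already been pulled out of the 17-dimensional state into their own array; where they sit inside the full state is not stated in this thread. The function name `split_global_entries` is hypothetical and not part of the benchmark.

```python
import numpy as np

def split_global_entries(extra):
    """Split a 5-entry vector into (position, planar_velocity).

    Assumption: `extra` holds only the five global entries, ordered as
    described in the reply above (0-2 position, 3-4 planar velocity).
    """
    extra = np.asarray(extra, dtype=float)
    if extra.shape[-1] != 5:
        raise ValueError("expected exactly 5 entries")
    position = extra[..., 0:3]         # global x, y, z
    planar_velocity = extra[..., 3:5]  # global planar velocity (2 entries)
    return position, planar_velocity

# Made-up numbers, purely for illustration.
example = np.array([0.5, 0.0, 0.7, 1.2, -0.1])
position, planar_velocity = split_global_entries(example)
```

The same slicing works on a batch of recorded steps with shape `(T, 5)`, since it only indexes the last axis.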
Thanks for your reply. I have a follow-up question about the x, y, z position. I trained an agent that reaches a reward of about 3000 and recorded its velocity and position (indices 0, 1, 8, 9, and 10 of the state). The first two figures show the velocities, and the last three figures show the x, y, z position. Intuitively, if the velocity is positive, the position should increase monotonically. But as the figures above show, the results are not as expected. Is there an explanation for this?
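One way to test the intuition above numerically, rather than by eye, is to compare the sign of each step-to-step position change with the sign of the recorded velocity. The sketch below is only an illustration on synthetic data; the function name `position_consistent_with_velocity` is hypothetical, and it assumes one position sample and one velocity sample per step along the same axis.

```python
import numpy as np

def position_consistent_with_velocity(pos, vel, tol=1e-8):
    """Return a boolean mask with one entry per step.

    An entry is True where the position change over that step has the
    same sign as the velocity recorded at the start of the step.
    """
    pos = np.asarray(pos, dtype=float)
    vel = np.asarray(vel, dtype=float)
    dpos = np.diff(pos)     # position change between consecutive samples
    v = vel[:-1]            # velocity at the start of each step
    return dpos * v >= -tol

# Synthetic trajectory: velocity is always positive, but the position
# drops once between the second and third samples.
vel = np.array([1.0, 1.0, 1.0, 1.0])
pos = np.array([0.0, 0.1, 0.05, 0.2])
mask = position_consistent_with_velocity(pos, vel)
inconsistent_steps = np.flatnonzero(~mask)
```

If many steps come out inconsistent on real recordings, that suggests the chosen indices may not correspond to the same axis, or may not be position and velocity at all, which is worth ruling out before looking for other explanations.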
Thank you very much for your benchmark. I have a question about the specific meaning of the state. For example, in 2-agent HalfCheetah (https://github.com/openai/gym/blob/master/gym/envs/mujoco/assets/half_cheetah.xml), the state has 17 dimensions and each agent's obs has 6 dimensions, which leaves 5 dimensions of the state unaccounted for. Even after looking at the official XML, I still don't know exactly what these 5 dimensions mean. I hope to get your reply. Thanks.