schroederdewitt / multiagent_mujoco

Benchmark for Continuous Multi-Agent Robotic Control, based on OpenAI's Mujoco Gym environments.
Apache License 2.0
334 stars, 34 forks

Meaning of variables in the environment #9

Status: Closed (yeshenpy closed this issue 3 years ago)

yeshenpy commented 3 years ago

Thank you very much for your benchmark. I have a question about the specific meaning of the state. In 2-agent HalfCheetah (model file: https://github.com/openai/gym/blob/master/gym/envs/mujoco/assets/half_cheetah.xml), the state has 17 dimensions and each agent's observation has 6 dimensions, so 17 - 2 × 6 = 5 dimensions of the state are not covered by either agent's observation. Even after reading the official XML, I cannot tell what these 5 dimensions mean. I hope to get your reply. Thanks.
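One way to locate the state entries that no agent observes is to compare values directly. The following is a minimal sketch on synthetic arrays, not output from the real environment. The helper name `unobserved_state_indices` and the per-agent index split are hypothetical, chosen only so that 17 - 2 × 6 = 5 entries remain.

```python
import numpy as np

def unobserved_state_indices(state, agent_obs, atol=1e-8):
    """Return indices of `state` whose value appears in no agent's observation."""
    state = np.asarray(state, dtype=float)
    seen = np.concatenate([np.asarray(o, dtype=float) for o in agent_obs])
    # Compare every state entry against every observed value.
    matches = np.isclose(state[:, None], seen[None, :], atol=atol, rtol=0.0)
    return np.flatnonzero(~matches.any(axis=1))

# Synthetic 17-dim state with distinct values (NOT real environment output).
state = np.arange(17, dtype=float) + 0.5

# Hypothetical split: each agent sees 6 entries; the real mapping may differ.
agent_obs = [
    state[[2, 3, 4, 11, 12, 13]],
    state[[5, 6, 7, 14, 15, 16]],
]

leftover = unobserved_state_indices(state, agent_obs)
print(leftover.tolist())  # [0, 1, 8, 9, 10]
```

On real data, value matching can produce false matches when two entries happen to be equal (for example, several zeros right after a reset), so it is safer to run the check on a state sampled a few steps into an episode.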

schroederdewitt commented 3 years ago

Hi, many thanks for your question. Entries 0-2 correspond to the global x, y, z position, and the next 2 entries encode the global planar velocity.

yeshenpy commented 3 years ago

Thanks for your reply. I have a follow-up question about the x, y, z position. I trained an agent that reaches a reward of about 3000 and recorded the velocity and position (indices 0, 1, 8, 9, 10 of the state).

[Figures: five plots. The first two show the velocities; the last three show the x, y, z positions.]

Intuitively, if the velocity is positive, the position should increase monotonically. As the figures show, the results do not match this expectation. Is there an explanation for this?
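The expectation that positive velocity implies increasing position can be tested numerically by integrating the recorded velocity and comparing it with the recorded position. The sketch below does this on synthetic traces. The step size `dt = 0.05`, the tolerance, and the helper names `position_from_velocity` and `is_consistent` are assumptions for illustration, not values or functions taken from the benchmark.

```python
import numpy as np

def position_from_velocity(velocity, dt, x0=0.0):
    """Integrate a velocity trace with the forward Euler rule."""
    velocity = np.asarray(velocity, dtype=float)
    # Position at step k is x0 plus the sum of v * dt over steps 0..k-1.
    return x0 + np.concatenate([[0.0], np.cumsum(velocity[:-1] * dt)])

def is_consistent(position, velocity, dt, tol):
    """True if the recorded position matches the integrated velocity within tol."""
    position = np.asarray(position, dtype=float)
    predicted = position_from_velocity(velocity, dt, x0=position[0])
    return bool(np.max(np.abs(position - predicted)) <= tol)

dt = 0.05                      # assumed control step, not read from the env
t = np.arange(200) * dt

# Synthetic traces (NOT recorded from the environment).
velocity = 2.0 + 0.5 * np.sin(t)              # always positive
position = 2.0 * t - 0.5 * np.cos(t) + 0.5    # exact integral with x(0) = 0
angle = 0.3 * np.sin(3.0 * t)                 # an oscillating, non-position quantity

print(is_consistent(position, velocity, dt, tol=0.05))  # True
print(is_consistent(angle, velocity, dt, tol=0.05))     # False
print(bool(np.all(np.diff(position) > 0)))              # True
```

If a recorded trace fails this check against a velocity that stays positive, that suggests the plotted state entry is not the position coordinate paired with that velocity, for example because the index mapping differs from what was assumed.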