google-deepmind / dm_control

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Apache License 2.0

How can we get a detailed explanation of the observation variables? #271

Open Frank-Dz opened 2 years ago

Frank-Dz commented 2 years ago

Hi, thanks for your great work. I am playing with the locomotion soccer environment, but I don't understand the meaning of some of the observation variables. They look like the following (name, shape, example value):

        "team_goal_back_right",  # (1, 2)       [[20.39657422 -5.54516916]]
        "team_goal_mid",  # (1, 3)      [[ 11.73723908 -10.76243655   1.67126411]]
        "team_goal_front_left",  # (1, 2)       [[  3.07790393 -15.97970393]]
        "field_front_left",  # (1, 2)       [[-56.03425312  20.39720981]]
        "opponent_goal_back_left",  # (1, 2)        [[-49.91189073  34.54660858]]
        "opponent_goal_mid",  # (1, 3)      [[-41.25255558  39.76387596   1.67126411]]
        "opponent_goal_front_right",  # (1, 2)      [[-32.59322043  44.98114335]]
        "field_back_right",  # (1, 2)       [[26.51893661  8.6042296 ]]

Where can we find a detailed explanation of these variables? Thanks!

Frank-Dz commented 2 years ago

Specifically, although the definitions can be found in the observables module, the meaning of each variable is still not clear, for example in the function at line 187: https://github.com/deepmind/dm_control/blob/master/dm_control/locomotion/soccer/observables.py#L187

kevinzakka commented 2 years ago

These observables generate values using the lambdas defined here. As you can see, these correspond to attributes of the goal posts, which are all subclasses of PositionDetector, defined here. From a quick glance, looks like they correspond to the Cartesian coordinates of the goal post's legs and center point.

Frank-Dz commented 2 years ago

Thanks for your explanation. I understand the variables regarding the goal. But what about the following?

        "end_effectors_pos",  # (1, 3)      [[0. 0. 0.]]
        "joints_pos",  # (1, 1)        [[-0.00367045]]
        "joints_vel",  # (1, 1)        [[0.03756077]]

So it seems there is no detailed documentation explaining these specific variables :(

kevinzakka commented 2 years ago

@Frank-Dz Those are not specific to the soccer tasks; they are common to the walker agents in the locomotion module. Specifically, you can check the CMU humanoid class described here for more information.

joints_pos and joints_vel are the respective positions and velocities of the joints of the humanoid agent. end_effectors_pos refers to the position of the end effectors of the humanoid agent relative to its torso, in the egocentric frame. The end effectors are defined here and they're basically the right/left hands/feet.

The above might differ for different soccer agents, so you'll have to check.
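To check which observables a given agent actually exposes, it can help to print every name together with its shape. Below is a minimal sketch, not part of dm_control: `shape_of`, `summarize_observation`, and `example_observation` are hypothetical names, and the example dictionary is just a stand-in built from values quoted in this thread.

```python
from collections import OrderedDict


def shape_of(value):
    """Return the shape of an array-like value as a tuple.

    Uses the `.shape` attribute when present (e.g. NumPy arrays) and
    otherwise walks nested lists/tuples, assuming they are rectangular.
    """
    shape = getattr(value, "shape", None)
    if shape is not None:
        return tuple(shape)
    dims = []
    while isinstance(value, (list, tuple)):
        dims.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(dims)


def summarize_observation(observation):
    """Map each observable name to its shape, preserving key order."""
    return OrderedDict(
        (name, shape_of(value)) for name, value in observation.items()
    )


# Hypothetical stand-in for one player's observation dict, using values
# quoted earlier in this thread (not produced by running the environment).
example_observation = OrderedDict([
    ("team_goal_mid", [[11.73723908, -10.76243655, 1.67126411]]),
    ("team_goal_back_right", [[20.39657422, -5.54516916]]),
    ("joints_pos", [[-0.00367045]]),
    ("end_effectors_pos", [[0.0, 0.0, 0.0]]),
])

summary = summarize_observation(example_observation)
for name, shape in summary.items():
    print(f"{name:24s} {shape}")
```

With a real soccer environment, the same helper could be applied to each player's observation from the time step returned by `env.reset()`, assuming the observation there is a list with one dictionary per player.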

Frank-Dz commented 2 years ago

Thank you so much! I think I get the idea!

By the way, I have one last question. For a soccer environment, we can use random_state to seed the randomized initial state. But what if we want the two teams to always start from a specific formation? For example, as in the attached image, with the red and blue teams always facing each other at the start of the match.

Frank-Dz commented 2 years ago

But why are the shapes of joints_pos and joints_vel (1, 1)? There is only one element; shouldn't there be 3 elements (xyz)?

My mistake: I think this is because this joint has only one degree of freedom.
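For anyone else puzzled by the (1, 1) shape: joint observations hold generalized coordinates, not xyz positions. In MuJoCo, a hinge or slide joint contributes one position and one velocity value, a ball joint 4 and 3, and a free joint 7 and 6. Here is a rough sketch of how the shapes add up, assuming the observable simply concatenates the coordinates of the observed joints and keeps the leading dimension of 1 seen above. `JOINT_DIMS` and `joint_observation_shapes` are hypothetical names, not dm_control API.

```python
# (position coordinates, velocity coordinates) per MuJoCo joint type.
JOINT_DIMS = {
    "free": (7, 6),   # 3D position + unit quaternion; linear + angular velocity
    "ball": (4, 3),   # unit quaternion; angular velocity
    "slide": (1, 1),  # displacement along one axis
    "hinge": (1, 1),  # rotation angle about one axis
}


def joint_observation_shapes(joint_types):
    """Return the assumed (joints_pos, joints_vel) shapes for a list of joint types.

    The leading 1 mirrors the extra dimension in the shapes quoted in this
    thread; it is an assumption about the layout, not taken from the library.
    """
    n_pos = sum(JOINT_DIMS[joint_type][0] for joint_type in joint_types)
    n_vel = sum(JOINT_DIMS[joint_type][1] for joint_type in joint_types)
    return (1, n_pos), (1, n_vel)


# A walker with a single observed hinge joint gives one value each.
single_hinge = joint_observation_shapes(["hinge"])
print(single_hinge)

# Three hinge joints give three values each: one angle per joint, not xyz.
three_hinges = joint_observation_shapes(["hinge", "hinge", "hinge"])
print(three_hinges)
```

So a walker with more observed joints would show a wider joints_pos, with one entry per degree of freedom.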