incognite-lab / myGym

myGym enables fast prototyping of RL in the area of robotic manipulation and navigation. You can train different robots, in several environments, on various tasks. There is an automatic evaluation and benchmarking tool. From version 2.1 there is support for multi-step tasks, multi-reward training and multi-network architectures.
https://mygym.readthedocs.io/en/latest/
MIT License

Discrepancy between action vector and `joints_angles` observation #46

Closed Mcibula closed 6 months ago

Mcibula commented 7 months ago

Hello!

I hope it is okay to post a somewhat longer question here.

I am testing the integrated motor-babbling control with the -ct random argument for the panda robot, with the following relevant config:

  "robot": "panda",
  "robot_action": "joints",
  "robot_init": [-0.4, 0.6, 0.5],
  "max_velocity": 10,
  "max_force": 100,
  "action_repeat": 1,

  "task_type": "reach",
  "task_objects": [
    {
      "init": {
        "obj_name": "cube_holes",
        "fixed": 0,
        "rand_rot": 0,
        "sampling_area": [-0.3, 0.3, 0.4, 0.6, 0.1, 0.1]
      },
      "goal": {
        "obj_name": "cube_holes",
        "fixed": 1,
        "rand_rot": 1,
        "sampling_area": [5, 5, 5, 5, 5, 5]
      }
    }
  ],
  "observation": {
    "actual_state": "obj_xyz",
    "goal_state": "obj_xyz",
    "additional_obs": ["joints_angles"]
  },

All I am trying to do is print out each performed random action and the subsequent observation from the environment in the timestep loop of the test_env() function in test.py:

print(f"Action:{action}")
observation, reward, done, info = env.step(action)

print(f'Observation: {observation}')

If I understand it correctly, a sampled action in this case should be a new joint configuration, so the joints_angles segment of the subsequent observation vector should be almost the same as the action vector. It seems to work like this when using slider control:

Observation: [-1.10144916e-01  4.91167079e-01  7.49899258e-02  4.99990013e+00
  4.99999489e+00  5.00000000e+00  4.37333765e-01  9.72781516e-01
  1.87621376e-01 -1.22585735e-01  1.31177055e+00  1.44094002e+00
  2.36656491e-07  0.00000000e+00]
Action:[0.4373335838317871, 0.9727801084518433, 0.1876215934753418, -0.12258744239807129, 1.3117706775665283, 1.4409408569335938, 0.0]
Observation: [-1.10144915e-01  4.91167079e-01  7.49899302e-02  4.99990013e+00
  4.99999489e+00  5.00000000e+00  4.37333765e-01  9.72781516e-01
  1.87621376e-01 -1.22585735e-01  1.31177055e+00  1.44094002e+00
  2.36656491e-07  0.00000000e+00]

However, when using random control, those values do not seem to correspond:

Action:[-0.51481533  1.6324749   2.1036975  -0.27616423  0.8946579   3.0306137
 -2.3106954 ]
Observation: [-0.08972217  0.40780999  0.09965184  5.00003808  4.99990754  5.
  0.41807567  0.9652326   0.22699737 -0.16194521  0.21542613  1.48197385
 -0.04166675  0.        ]
Action:[-2.3407836   0.99039567  2.1191826  -1.9024386  -1.879574    3.793361
 -1.682037  ]
Observation: [-0.08972217  0.40780999  0.0992242   5.00003808  4.99990754  5.
  0.39251881  0.96309263  0.27012038 -0.20360899  0.17760391  1.52364036
 -0.08333349  0.        ]
Action:[ 2.4431515 -1.5354003 -1.1559559 -1.2074044  2.5261626  3.121788
  1.876207 ]
Observation: [-0.08972217  0.40780999  0.09862862  5.00003808  4.99990754  5.
  0.37897825  0.95914598  0.22845184 -0.24527874  0.21926947  1.56530642
 -0.04166699  0.        ]
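To make the mismatch concrete, one can compare a sampled action against the joints_angles slice of the next observation (indices 6:13, assuming the observation layout implied by the config above):

```python
# First action/observation pair from the random-control log above.
action = [-0.51481533, 1.6324749, 2.1036975, -0.27616423,
          0.8946579, 3.0306137, -2.3106954]
observation = [-0.08972217, 0.40780999, 0.09965184, 5.00003808, 4.99990754, 5.0,
               0.41807567, 0.9652326, 0.22699737, -0.16194521, 0.21542613,
               1.48197385, -0.04166675, 0.0]

# joints_angles slice assumed to be indices 6:13 for this config.
joints = observation[6:13]
diffs = [abs(a - j) for a, j in zip(action, joints)]
print(max(diffs))  # well over 1 rad: the arm did not reach the commanded pose
```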

So I wanted to ask, whether I misunderstood what the action vector represents, or whether there is a problem causing this discrepancy.

Thank you very much.

michalvavrecka commented 7 months ago

Hi Miro. The problem is the randomness of the sampled actions. We use this mode only to test whether all joints are moving. If you use it to control the robot, you will see the difference between PLAN and REALITY. The random parameter samples an arbitrary target for each joint, but since the action is executed within one simulation step, the arm only moves toward that target as far as its speed and force allow. It cannot reach distant positions within one step (a physical limitation). You can use the random parameter, but you need to increase speed and force in the config: "max_velocity": 30, "max_force": 500
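The velocity limit is actually visible in the logs above: assuming PyBullet's default timestep of 1/240 s (an assumption; myGym may override it), max_velocity = 10 rad/s allows at most about 0.0417 rad of joint travel per step, which matches the per-step change of the last joint (-0.04166675, -0.08333349, ...). A rough sanity check:

```python
# With velocity-limited position control, the largest joint displacement
# achievable in one simulation step is max_velocity * dt.
dt = 1.0 / 240.0          # ASSUMED PyBullet default timestep
max_velocity = 10.0       # from the user's config
max_delta_per_step = max_velocity * dt
print(max_delta_per_step)  # ~0.0417 rad, matching the logged joint change

# Raising max_velocity to 30 triples the reachable displacement per step.
print(30.0 * dt)
```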

If you want to reach even distant positions in one action, you have to increase a third parameter: "action_repeat": 20. This will run 20 simulation steps between actions, so even distant goals can be reached.
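Under the same assumed 1/240 s timestep, the two suggested config changes multiply together: each action then spans 20 steps at the higher velocity cap, so the reachable joint displacement per action grows accordingly.

```python
# Reachable joint displacement per action = max_velocity * dt * action_repeat.
dt = 1.0 / 240.0       # ASSUMED PyBullet default timestep
max_velocity = 30.0    # suggested config value
action_repeat = 20     # suggested config value
reach_per_action = max_velocity * dt * action_repeat
print(reach_per_action)  # 2.5 rad per action, enough for most joint moves
```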

But as I said, random actions are not a good way to control the robot. If you want motor babbling, run an untrained network in the step robot_control mode; it guarantees small increments of the end-effector position that are reachable within one simulation step.

Mcibula commented 7 months ago

Oh, I understand now; thank you.

gabinsane commented 6 months ago

Closing this issue; let us know if you need any more help.