m-abr / FCPCodebase

FC Portugal Codebase
GNU General Public License v3.0

About Basic_Run Training #13

Closed by Aike2071 8 months ago

Aike2071 commented 8 months ago

I tried to train the Walk behavior using Basic_Run.py in the codebase, but the pkl file outputs an array of size 22 instead of 16 as in the original Walk. Do the 22 action indices match the joint indices of the robot?

m-abr commented 8 months ago

The walk skill that you can find at /behaviors/custom/Walk/ is a bit different from the skill that is learned through /scripts/gyms/Basic_Run.py.


The Walk has a neural network that controls 16 actions:

These actions are added to a step trajectory generator with fixed parameters (step duration: 8, step vertical span: 0.02 m, step z max: 70%). The final action is then converted into joint positions through inverse kinematics.


In Basic_Run.py, the neural network controls 22 actions:

The actions are then processed as follows:

  1. self.player.behavior.execute("Step", ...) is called to generate the next values for the step trajectory. It uses inverse kinematics internally to compute joint positions. In the first time step of each episode, the step vertical span and the step z max indicated by the neural network are used to configure the step trajectory generator. After that, the last two values produced by the neural network (indices 20 and 21) are ignored.
  2. A new vector of scaled joint positions called new_action is created from the first 20 values of the neural network's output:
    new_action = self.act[:20] * 2 # scale up actions to motivate exploration
  3. The joint positions computed in the 1st step are extracted from self.step_obj and added to new_action:
    new_action[[0,2,4,6,8,10]] += self.step_obj.values_l
    new_action[[1,3,5,7,9,11]] += self.step_obj.values_r
  4. Some biases are added to control the initial position of the robot:
    new_action[12] -= 90 # arms down
    new_action[13] -= 90 # arms down
    new_action[16] += 90 # untwist arms
    new_action[17] += 90 # untwist arms
    new_action[18] += 90 # elbows at 90 deg
    new_action[19] += 90 # elbows at 90 deg
  5. new_action is assigned to robot joints [2-21]:
    r.set_joints_target_position_direct( # commit actions:
        slice(2,22),        # act on all joints except head & toes (for robot type 4)
        new_action,         # target joint positions
        harmonize=False     # there is no point in harmonizing actions if the targets change at every step
    )
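
Steps 2 to 5 above can be sketched as a standalone function, which may help when checking how the 22 outputs line up with the joints. This is a minimal sketch, not the code from Basic_Run.py: compose_joint_targets and action_to_joint_index are hypothetical names, and step_values_l / step_values_r are plain arrays standing in for self.step_obj.values_l and self.step_obj.values_r.

```python
import numpy as np

def compose_joint_targets(act, step_values_l, step_values_r):
    """Build the 20 joint targets from a 22-value network output.

    act            -- 22 network outputs; act[:20] are per-joint residuals,
                      act[20:] are step-generator parameters (not used here)
    step_values_l  -- 6 left-leg joint positions (stand-in for step_obj.values_l)
    step_values_r  -- 6 right-leg joint positions (stand-in for step_obj.values_r)
    """
    act = np.asarray(act, dtype=float)
    if act.shape != (22,):
        raise ValueError("expected 22 actions")

    # Step 2: scale the first 20 outputs (this creates a new array)
    new_action = act[:20] * 2

    # Step 3: add the leg joint positions from the step generator
    new_action[[0, 2, 4, 6, 8, 10]] += np.asarray(step_values_l, dtype=float)
    new_action[[1, 3, 5, 7, 9, 11]] += np.asarray(step_values_r, dtype=float)

    # Step 4: biases for the initial arm pose
    new_action[[12, 13]] -= 90  # arms down
    new_action[[16, 17]] += 90  # untwist arms
    new_action[[18, 19]] += 90  # elbows at 90 deg
    return new_action

def action_to_joint_index(i):
    """Step 5: targets are written to slice(2, 22), so action i drives joint i + 2."""
    if not 0 <= i < 20:
        raise ValueError("only actions 0..19 map to joints")
    return i + 2
```

Under these assumptions, action indices 0 to 19 correspond to robot joints 2 to 21 (an offset of 2), while indices 20 and 21 do not map to any joint because they only configure the step generator.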

Note that these two approaches are just examples of what can be accomplished. You can modify these control methods to better suit your needs.

Aike2071 commented 8 months ago

Thank you so much!!