Improbable-AI / walk-these-ways

Sim-to-real RL training and deployment tools for the Unitree Go1 robot.
https://gmargo11.github.io/walk-these-ways/

How to set this config for performing `Gait-free baseline model`? #27

Closed GuoPingPan closed 1 year ago

GuoPingPan commented 1 year ago

I want to know how to set the config file so that the Go1 performs the gait-free type. Maybe just setting Cfg.commands.gaitwise_curricula = False is not enough.

What's more, I noticed that go1_gym_learn/ppo/ppo.py might use RMA, which has not been used in this project. Why?

GuoPingPan commented 1 year ago

@gmargo11

gmargo11 commented 1 year ago

Hi @GuoPingPan ,

Sorry about the delayed response -- here's my feedback!

I want to know how to set the config file so that the Go1 performs the gait-free type. Maybe just setting Cfg.commands.gaitwise_curricula = False is not enough.

To train the gait-free baseline as used in the paper, you simply need to set the reward coefficients for the gait-related terms to zero. Starting from https://github.com/Improbable-AI/walk-these-ways/blob/master/scripts/train.py#L122, modify the six "augmented auxiliary rewards" from Table 1 of the paper to have zero weight:

Cfg.reward_scales.jump = 0.0
Cfg.reward_scales.raibert_heuristic = 0.0
Cfg.reward_scales.feet_clearance_cmd_linear = 0.0
Cfg.reward_scales.orientation_control = 0.0
Cfg.reward_scales.tracking_contacts_shaped_force = 0.0
Cfg.reward_scales.tracking_contacts_shaped_vel = 0.0
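
As a quick sanity check, here is a minimal sketch that applies the overrides in a loop and verifies none of the six terms silently kept a nonzero default. It assumes Cfg is the same config object that scripts/train.py imports from go1_gym.envs.base.legged_robot_config; verify the import path against your checkout.

from go1_gym.envs.base.legged_robot_config import Cfg  # path assumed to match scripts/train.py

# The six gait-shaping ("augmented auxiliary") reward terms from Table 1 of the paper.
gait_terms = [
    "jump",
    "raibert_heuristic",
    "feet_clearance_cmd_linear",
    "orientation_control",
    "tracking_contacts_shaped_force",
    "tracking_contacts_shaped_vel",
]

# Zero out each term and confirm the override actually took effect before launching training.
for name in gait_terms:
    setattr(Cfg.reward_scales, name, 0.0)
    assert getattr(Cfg.reward_scales, name) == 0.0, f"{name} is still nonzero"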

What's more, I noticed that go1_gym_learn/ppo/ppo.py might use RMA, which has not been used in this project. Why?

Yes, that file implements a variant of RMA, which was not used in the paper or the pretrained model. You'll notice that the file we actually use for training is https://github.com/Improbable-AI/walk-these-ways/blob/master/go1_gym_learn/ppo_cse/ppo.py, which implements the state-estimator-based approach instead. We didn't examine the relative performance of the two in the Walk These Ways paper, but you can switch the code to test them out.
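
For illustration, a minimal sketch of how one might switch training backends in scripts/train.py. The ppo_cse import below is the one the repo uses; whether go1_gym_learn.ppo exposes a Runner with the same constructor and learn() signature is an assumption to verify, and the call pattern is paraphrased from train.py rather than copied, so check it against your checkout.

# State-estimator (CSE) training, as used for the paper and the pretrained model:
from go1_gym_learn.ppo_cse import Runner

# Hypothetical switch to the RMA-style variant -- only if its Runner API matches:
# from go1_gym_learn.ppo import Runner

# env is the wrapped training environment built earlier in train.py
runner = Runner(env, device="cuda:0")
runner.learn(num_learning_iterations=100000, init_at_random_ep_len=True, eval_freq=100)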

-Gabe

GuoPingPan commented 1 year ago

Thanks a lot.

GuoPingPan commented 12 months ago

@gmargo11 Though I set all of the above reward scales to zero, it seems to have no effect. I even tried modifying the code as below:

if self.cfg.commands.gaitwise_curricula:
    # self.category_names = ['pronk', 'trot', 'pace', 'bound']
    self.category_names = ['trot']

or setting gaitwise_curricula = False.

So I want to know why this doesn't work.

OlalekanIsola commented 5 months ago

@GuoPingPan Were you able to address this issue? I am particularly interested in making the robot choose its gait automatically rather than changing it manually through the controller.