LeCAR-Lab / human2humanoid

[IROS 2024] Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation. [CoRL 2024] OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning
https://omni.human2humanoid.com/

Question about the Gait #17

Open zengweishuai opened 2 days ago

zengweishuai commented 2 days ago

Hi, thanks for your great work again!

After training the privileged policy and the distilled policy, I noticed that the gait of the humanoid robot looks a little odd. Specifically, when it tries to walk forward, only one leg follows the reference motion while the other lags behind, as if the robot were lame.

The checkpoints I use are 25000 for the privileged policy and 18000 for the distilled policy.

Could you share any ideas about why this might happen? Sorry for the trouble, and I look forward to your reply!

TairanHe commented 1 day ago

Do you mind sharing a video/GIF of the gait you are referring to? 25000 iterations for the privileged policy might not be enough. Try 100000 iterations and train with the default reward/curriculum.

zengweishuai commented 1 day ago

Hi @TairanHe, thanks for your reply!

Here is a GIF showing the situation I encountered when selecting checkpoint 26500 for the privileged policy:

[GIF: priv_3]

I chose this checkpoint because the reward graph shows the reward declining after about 1.5M steps, so I picked a checkpoint between 1M and 1.5M steps (between checkpoints 25000 and 35000). Does the training graph look correct? It is attached below:

[Image: result]
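For readers following the thread, the checkpoint-selection heuristic described above (pick a saved checkpoint before the reward starts to decline) can be sketched as below. This is only an illustration of that heuristic, not something from the repository or the maintainers: the function names, the smoothing window, and all numbers in the example are hypothetical, and the reward log is assumed to be available as plain Python lists.

```python
def moving_average(values, window):
    """Trailing moving average; early entries average over fewer points."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def pick_checkpoint(iterations, rewards, checkpoint_iters, window=3):
    """Return the latest saved checkpoint at or before the smoothed reward peak.

    iterations       -- iteration number of each logged reward (hypothetical log)
    rewards          -- mean reward logged at each of those iterations
    checkpoint_iters -- iteration numbers at which checkpoints were saved
    """
    smoothed = moving_average(rewards, window)
    peak_index = max(range(len(smoothed)), key=lambda i: smoothed[i])
    peak_iter = iterations[peak_index]
    # Keep only checkpoints saved no later than the peak.
    candidates = [c for c in checkpoint_iters if c <= peak_iter]
    # Fall back to the earliest checkpoint if none precede the peak.
    return max(candidates) if candidates else min(checkpoint_iters)


# Made-up reward curve that rises and then declines.
iterations = [5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000]
rewards = [1.0, 2.0, 3.0, 3.5, 4.0, 4.2, 3.0, 2.0]
chosen = pick_checkpoint(iterations, rewards, [25000, 26500, 35000])
print(chosen)
```

Smoothing before taking the peak keeps a single noisy spike from deciding the choice. The sketch only locates the peak of one logged curve and says nothing about whether training has run long enough.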

Does the default reward you mentioned refer to training with rewards_base.yaml?

Thanks again for your quick reply!