Open zengweishuai opened 2 days ago
Do you mind sharing a video/GIF of the gait you are referring to? 25000 iterations for the privileged policy might not be enough. Try 100000 iterations and train with the default reward/curriculum.
Hi @TairanHe, thanks for your reply!
Here is a GIF showing what I encountered when selecting checkpoint 26500 for the privileged policy:
The reason I chose this checkpoint is that I noticed in the reward graph that the reward declines after about 1.5M steps. So I chose a checkpoint between 1M and 1.5M steps (between checkpoints 25000 and 35000). Is the training graph correct? The graph is attached below:
Does the default reward you mentioned refer to training with rewards_base.yaml?
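For reference, the way I picked the checkpoint can be sketched roughly as below. This is only a minimal sketch and not code from the repository: `moving_average`, `pick_checkpoint`, the smoothing window, and the sample numbers are all hypothetical, and it assumes the logged rewards can be exported as a plain list alongside their iteration numbers.

```python
def moving_average(values, window):
    """Trailing moving average; early entries average over fewer points."""
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


def pick_checkpoint(iterations, rewards, window=3):
    """Return the logged iteration whose smoothed reward is highest."""
    if len(iterations) != len(rewards) or not rewards:
        raise ValueError("iterations and rewards must be non-empty and equal length")
    smoothed = moving_average(rewards, window)
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    return iterations[best]


# Made-up numbers for illustration only (not my actual training log).
iterations = [5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000]
rewards = [1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 3.0, 2.0]
best_iteration = pick_checkpoint(iterations, rewards, window=3)
print(best_iteration)
```

Smoothing only avoids picking a single noisy spike; a peak in the reward curve does not by itself guarantee a good gait.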
Thanks again for your quick reply!
Hi, thanks for your great work again!
After training the privileged policy and the distilled policy, I noticed that the gait of the humanoid robot looks a little weird. Specifically, when it tries to walk forward, only one leg follows the reference motion while the other falls behind, as if the robot were lame.
The checkpoints I use are 25000 for the privileged policy and 18000 for the distilled policy.
Can you share some clues as to why this may happen? Sorry for the disturbance. I look forward to your reply!