antonilo / vision_locomotion

Project Code for the paper "Learning Visual Locomotion with Cross-Modal Supervision" (ICRA2023)
https://antonilo.github.io/vision_locomotion/

Falling down when deployed in Go1 #7

Open dstx123 opened 9 months ago

dstx123 commented 9 months ago

I tried to retrain the RMA algorithm using the Go1 URDF, but the robot tends to lean forward or fall over. Do you have any suggestions? The parameter "rl_coeff" is set to 1. Below are the visual results of the two stages. Trained for 24,000 rounds in the first phase:

https://github.com/antonilo/vision_locomotion/assets/51892853/f6736bcd-b444-4e28-a7fd-d6f69269d38e

Trained for 1,200 rounds in the second stage:

https://github.com/antonilo/vision_locomotion/assets/51892853/b76acec5-cacc-401b-8761-5cf9137ef631

Result graphs (teacher, student):

By the way, what does the parameter "jt_mean_pos" in "Environment.hpp" mean? Will it affect Go1 training? And why do the hip joints of robots trained with the RMA algorithm tend to spread outward? This makes the gait look unnatural.

antonilo commented 8 months ago

How does your phase I policy behave when you command a velocity of zero? The problem might be there and not in DAgger. From a quick inspection, it looks like the work penalty might be too high when the commanded action is zero. jt_pos_mean is the standard position of the joints. You can use the same values for the A1 and the Go1 (I tested both). The hip extension comes from the fact that a single policy has to handle everything (steps and flat ground). If you increase the number of points, the gait will be more natural.
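
To make the work-penalty point concrete, here is a minimal sketch. It assumes a work term of the form -coeff * |tau · (q_t - q_{t-1})|, as is common in RMA-style rewards. The function name, the coefficient values, and the joint numbers are hypothetical and are not taken from this repository's reward code.

```python
def work_penalty(torques, q_now, q_prev, coeff):
    """Return -coeff * |sum_i tau_i * (q_now_i - q_prev_i)| (assumed form)."""
    work = sum(t * (a - b) for t, a, b in zip(torques, q_now, q_prev))
    return -coeff * abs(work)


# Hypothetical 12-joint quadruped making a small balance correction
# while the commanded velocity is zero.
torques = [5.0] * 12   # N*m, illustrative holding torques
q_prev = [0.0] * 12    # rad
q_now = [0.01] * 12    # rad, small joint motion

for coeff in (0.01, 1.0):
    print(f"coeff={coeff}: penalty={work_penalty(torques, q_now, q_prev, coeff):.4f}")
```

With zero commanded velocity there is little tracking reward to offset this term, so a large coefficient can make even small corrective motions costly. Comparing the per-step magnitude of the penalty against the other reward terms is one way to check whether that is happening.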