lauramsmith / fine-tuning-locomotion

Apache License 2.0
114 stars 22 forks source link

Legged Robots that Keep on Learning

Official codebase for Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World, which contains code for training a simulated or real A1 quadrupedal robot to imitate various reference motions, pre-trained policies, and example training code for learning the policies.

animated

Project page: https://sites.google.com/berkeley.edu/fine-tuning-locomotion

Getting Started

Install dependencies:

Training Policies in Simulation

To train a policy, run the following command:

python3 motion_imitation/run_sac.py \
--mode train \
--motion_file [path to reference motion, e.g., motion_imitation/data/motions/pace.txt] \
--int_save_freq 1000 \
--visualize

Evaluating and/or Fine-Tuning Trained Policies

We provide checkpoints for the pre-trained models used in our experiments in motion_imitation/data/policies/.

Evaluating a Policy in Simulation

To evaluate individual policies, run the following command:

python3 motion_imitation/run_sac.py \
--mode test \
--motion_file [path to reference motion, e.g., motion_imitation/data/motions/pace.txt] \
--model_file [path to imitation model checkpoint, e.g., motion_imitation/data/policies/pace.ckpt] \
--num_test_episodes [# episodes to test] \
--use_redq \
--visualize

Autonomous Training using a Pre-Trained Reset Controller

To fine-tune policies autonomously, add a path to a trained reset policy (e.g., motion_imitation/data/policies/reset.ckpt) and a (pre-trained) imitation policy.

python3 motion_imitation/run_sac.py \
--mode train \
--motion_file [path to reference motion] \
--model_file [path to imitation model checkpoint] \
--getup_model_file [path to reset model checkpoint] \
--use_redq \
--int_save_freq 100 \
--num_test_episodes 20 \
--finetune \
--real_robot

To run two SAC trainers, one learning to walk forward and one backward, add a reference and checkpoint for another policy and use the multitask flag.

python motion_imitation/run_sac.py \
--mode train \
--motion_file motion_imitation/data/motions/pace.txt \
--backward_motion_file motion_imitation/data/motions/pace_backward.txt \
--model_file [path to forward imitation model checkpoint] \
--backward_model_file [path to backward imitation model checkpoint] \
--getup_model_file [path to reset model checkpoint] \
--use_redq \
--int_save_freq 100 \
--num_test_episodes 20 \
--real_robot \
--finetune \
--multitask

Running MPC on the real A1 robot

Since the SDK from Unitree is implemented in C++, we find the optimal way of robot interfacing to be via C++-python interface using pybind11.

Step 1: Build and Test the robot interface

To start, build the python interface by running the following: bash cd third_party/unitree_legged_sdk mkdir build cd build cmake .. make Then copy the built robot_interface.XXX.so file to the main directory (where you can see this README.md file).

Step 2: Setup correct permissions for non-sudo user

Since the Unitree SDK requires memory locking and high-priority process, which is not usually granted without sudo, add the following lines to /etc/security/limits.conf:

<username> soft memlock unlimited
<username> hard memlock unlimited
<username> soft nice eip
<username> hard nice eip

You may need to reboot the computer for the above changes to get into effect.

Step 3: Test robot interface.

Test the python interfacing by running: 'sudo python3 -m motion_imitation.examples.test_robot_interface'

If the previous steps were completed correctly, the script should finish without throwing any errors.

Note that this code does not do anything on the actual robot.

Running the Whole-body MPC controller

To see the whole-body MPC controller in sim, run: bash python3 -m motion_imitation.examples.whole_body_controller_example

To see the whole-body MPC controller on the real robot, run: bash sudo python3 -m motion_imitation.examples.whole_body_controller_robot_example