huggingface / lerobot

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Apache License 2.0

Control simulated robot with real leader #514

Open michel-aractingi opened 1 week ago

michel-aractingi commented 1 week ago

What this does

Adds a script control_sim_robot.py in lerobot/scripts that has the same functionality and interface as control_robot.py but for simulated environments.

The script has three control modes: teleoperate, record, and replay.

The dataset created contains additional columns related to reinforcement learning, such as next.reward, next.success, and seed.
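
For illustration, a frame in such a dataset might carry these extra entries on top of the usual observation/action keys. This is a hypothetical sketch; only next.reward, next.success, and seed are named in this PR, and the shapes/dtypes are assumptions:

import torch

# Hypothetical per-frame layout; shapes and dtypes are illustrative.
frame = {
    "observation.state": torch.zeros(12),   # state_dim from the example env config
    "action": torch.zeros(6),               # action_dim from the example env config
    "next.reward": torch.tensor(0.0),       # reward returned by env.step()
    "next.success": torch.tensor(False),    # task-success flag
    "seed": torch.tensor(42),               # seed used at env.reset(), needed for replay
}
print({k: tuple(v.shape) for k, v in frame.items()})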

Simulation environments

Along with the --robot-path argument, the script requires a path to the configuration file of the simulation environment, defined in lerobot/configs/env. Example of the configuration file for gym_lowcostrobot:

env:
  name: lowcostrobot
  fps: ${fps}
  handle: PushCubeLoop-v0
  state_dim: 12
  action_dim: 6

  gym:
    render_mode: human
    max_episode_steps: 100000

calibration:
  axis_directions: [-1, -1, 1, -1, -1, -1]
  offsets: [0, -0.5, -0.5, 0, -0.5, 0] # factor of pi

eval:
  use_async_envs: false

state_keys:
  observation.state: 'arm_qpos'
  observation.velocity: 'arm_qvel'
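
As a rough sketch (not the script's actual loading code), such a config could be consumed along these lines, assuming OmegaConf-style loading and the standard gymnasium registration pattern:

import gymnasium as gym
from omegaconf import OmegaConf

import gym_lowcostrobot  # noqa: F401  (importing the package registers its env ids)

# Load the env config shown above and build the simulated environment from it.
cfg = OmegaConf.load("lerobot/configs/env/mujoco.yaml")
env = gym.make(cfg.env.handle, **OmegaConf.to_container(cfg.env.gym))
obs, info = env.reset(seed=0)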

Essential elements:

  1. Name of the environment, which defines the gym package so it can be imported correctly.
  2. The environment handle, used to create the specific task within the package.
  3. Calibration arguments: necessary for transforming the real leader positions to the simulated ones (see the sketch after this list).
  4. State keys (optional): for mapping the environment's state names to the standard LeRobot state keys in the LeRobotDataset.
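
A minimal sketch of how the calibration values above might be applied, assuming the mapping is a per-joint sign flip plus an offset expressed as a factor of pi; the function name and formula are illustrative, not the script's exact implementation:

import numpy as np

# Values taken from the calibration section of the example config above.
AXIS_DIRECTIONS = np.array([-1, -1, 1, -1, -1, -1])
OFFSETS = np.array([0, -0.5, -0.5, 0, -0.5, 0]) * np.pi  # offsets are a factor of pi

def real_to_sim_qpos(real_qpos_rad: np.ndarray) -> np.ndarray:
    """Map real leader joint positions (radians) to simulated joint positions."""
    return AXIS_DIRECTIONS * real_qpos_rad + OFFSETS

print(real_to_sim_qpos(np.zeros(6)))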

How to test

First, install the gym_lowcostrobot environment and add the environment's config file in YAML format.

Test teleoperation:

python lerobot/scripts/control_sim_robot.py teleoperate \
  --robot-path lerobot/configs/robot/koch.yaml --sim-config lerobot/configs/env/mujoco.yaml 

Test data collection and upload to hub:

python lerobot/scripts/control_sim_robot.py record \
  --robot-path lerobot/configs/robot/koch.yaml --sim-config lerobot/configs/env/mujoco.yaml \
  --fps 40 --root data --repo-id $USER/test_mujoco \
  --episode-time-s 30 --num-episodes 50 --push-to-hub 1

Replay the episodes:

python lerobot/scripts/control_sim_robot.py replay \
  --fps 30 --root data --repo-id $USER/test_mujoco \
  --episode 0 1 2 3 --sim-config lerobot/configs/env/mujoco.yaml

In the script we save the seed in the dataset, which lets us reset the environment to the same state it was in during data collection and makes the replay successful.
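
A minimal sketch of why the stored seed makes replay work, using the standard gymnasium reset API (variable names are illustrative; in practice the values would come from the dataset's seed and action columns):

import gymnasium as gym
import gym_lowcostrobot  # noqa: F401  (registers PushCubeLoop-v0)

env = gym.make("PushCubeLoop-v0")
recorded_seed = 42        # in practice, read from the dataset's seed column
recorded_actions = []     # in practice, the episode's recorded actions

# Resetting with the same seed reproduces the initial state of the episode,
# so stepping through the recorded actions retraces the collected trajectory.
obs, info = env.reset(seed=recorded_seed)
for action in recorded_actions:
    obs, reward, terminated, truncated, info = env.step(action)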

Finally visualize the dataset:

python lerobot/scripts/visualize_dataset_html.py --repo-id $USER/test_mujoco --episodes 0 1 2 3 4

TODO:

Note: The current script requires a real leader in order to teleoperate sim environments. We can add support for keyboard control of the end effector for people who don't have the real robot.

michel-aractingi commented 4 days ago

Thanks @marinabar who tested the script on her setup.

apockill commented 3 days ago

I'm noticing quite a bit of the new script could be DRY'ed up, since it rehashes a fair bit of the original control_robot.

I'm curious- what issues are you finding with using control_robot with simulated environments? Maybe there are ways of improving control_robot so it better handles abstractions like simulation. That way there's less code to maintain :grin:

michel-aractingi commented 3 days ago

Hey @apockill! You're right, we might be able to find a general solution in control_robot.py, but I feel there are a few elements that could make the script ugly.

  1. env vs robot. In control_robot.py, reading and writing to the robot is done using only the Robot class. In simulation, we would require an additional environment instance along with the robot, which would also have to be passed to all the functions. We could modify lerobot/common/robot_devices/control_utils.py, for instance, and put ifs and elses everywhere to account for that, but I think it would add unnecessary complexity.
  2. There are some smaller details: the dataset you acquire in simulation has additional labels that we don't have in the real environment, like rewards, successes, and the env seed. Handling the frame rate (fps) on the real system vs. in simulation is also different.

So even though the two scripts resemble each other I still think it is cleaner to have them separate. What do you think? If you have some vision of how we can improve on that or merge the two scripts I would be happy to chat or have a look :D