huggingface / lerobot

đŸ¤— LeRobot: Making AI for Robotics more accessible with end-to-end learning
Apache License 2.0
7.67k stars 738 forks source link

Recording policies with `--root` argument set fails if you've recorded a policy before #534

Open apockill opened 2 hours ago

apockill commented 2 hours ago

System Info

lerobot-1  | Traceback (most recent call last):
lerobot-1  |   File "/tmp/ee8739f1-22d7-410a-afe3-ab4eaf2289a7/lerobot/scripts/control_robot.py", line 555, in <module>
lerobot-1  |     record(robot, **kwargs)
lerobot-1  |   File "/tmp/ee8739f1-22d7-410a-afe3-ab4eaf2289a7/lerobot/common/robot_devices/utils.py", line 28, in wrapper
lerobot-1  |     raise e
lerobot-1  |   File "/tmp/ee8739f1-22d7-410a-afe3-ab4eaf2289a7/lerobot/common/robot_devices/utils.py", line 24, in wrapper
lerobot-1  |     return func(robot, *args, **kwargs)
lerobot-1  |   File "/tmp/ee8739f1-22d7-410a-afe3-ab4eaf2289a7/lerobot/scripts/control_robot.py", line 249, in record
lerobot-1  |     dataset = LeRobotDataset.create(
lerobot-1  |   File "/tmp/ee8739f1-22d7-410a-afe3-ab4eaf2289a7/lerobot/common/datasets/lerobot_dataset.py", line 925, in create
lerobot-1  |     obj.meta = LeRobotDatasetMetadata.create(
lerobot-1  |   File "/tmp/ee8739f1-22d7-410a-afe3-ab4eaf2289a7/lerobot/common/datasets/lerobot_dataset.py", line 291, in create
lerobot-1  |     obj.root.mkdir(parents=True, exist_ok=False)
lerobot-1  |   File "/usr/lib/python3.10/pathlib.py", line 1175, in mkdir
lerobot-1  |     self._accessor.mkdir(self, mode)
lerobot-1  | FileExistsError: [Errno 17] File exists: 'data'

Information

Reproduction

Run control_robot.py record --root "data" and it will fail if the data directory already exists.

Expected behavior

It should be okay to have many datasets under your root directory.

{root}/{HF_USER}/{dataset_name}
{root}/{HF_USER}/{dataset_name}
{root}/{HF_USER}/{dataset_name}

In the above example, only dataset_name should be forced to be unique.

apockill commented 2 hours ago

I'm a bit confused by lines around the codebase doing

Path(root) if root is not None else LEROBOT_HOME / repo_id

Does this imply that root should already contain the full root/user/dataset_name within it? Intuitively I would have expected it should be:

Path(root) if root is not None else LEROBOT_HOME

Without the / repo_id. I'm happy to submit a PR once I know what the intended meaning of root is within the new 2.0 system!