huggingface / lerobot

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Apache License 2.0

Is there any documentation to create a custom dataset? #304

Open HiroIshida opened 1 month ago

HiroIshida commented 1 month ago

lerobot/examples explains how to load and train on the existing datasets in the Hugging Face repos. What I'd like to know instead is how to turn self-collected data into a dataset. Is there any documentation for that?

RochMollero commented 1 month ago

Same here. I have my data ready, but the dataset classes seem rather complex to instantiate, so one or more examples (depending on the number of cameras, for instance) would be very nice.

TheArtificialOutsider commented 1 month ago

Same here! I really need an example.

zwbx commented 1 month ago

same question!

Cadene commented 1 month ago

@TheArtificialOutsider @zwbx @RochMollero Got it! We will address this issue very soon, and simplify stuff ;)

Any chance you could provide a very short sample of the datasets in the comment?

In the meantime, a few pointers and resources:

README:

See how we use from_preloaded:

See the content of these files to instantiate the hf_dataset, encode the videos, or store frames, etc.

Cadene commented 1 month ago

cc @michel-aractingi for visibility ;)

zwbx commented 1 month ago

Hi, thanks for your attention to this matter. I am currently using the RLBench dataset. I have the raw data, which contains image observations and actions. How can I organize it and convert it to the hf dataset format?

x2ss commented 21 hours ago

It would be even better if there were tutorials on how to train using custom data from the gym simulation environment.

Cadene commented 10 hours ago

Hey there, we are still working on a simplification of the dataset class + upload to hub + tutorial! :) Sorry if it's taking some time!

Unfortunately, the only option now is to get familiar with: https://github.com/huggingface/lerobot/blob/main/lerobot/scripts/push_dataset_to_hub.py

See the example commands in the file header. You may be able to adapt one of them to your dataset format. If you have trouble understanding the code, reach out to us in the #help channel on Discord.
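
For anyone starting from raw episodes (images plus actions, as asked above), here is a minimal, library-free sketch of the kind of reorganization involved: flattening per-episode data into frame-wise columns and recording where each episode starts and ends. This is not the code from push_dataset_to_hub.py. The function name `flatten_episodes`, the input layout, and the column names (`observation.image`, `action`, `episode_index`, `frame_index`, `timestamp`, `next.done`, `index`, and the `from`/`to` boundaries) are assumptions made for illustration and should be checked against the script before use.

```python
from typing import Any


def flatten_episodes(
    episodes: list[dict[str, list[Any]]], fps: int
) -> tuple[dict[str, list[Any]], dict[str, list[int]]]:
    """Flatten per-episode data into frame-wise columns (hypothetical helper).

    Each episode is assumed to be a dict with two equally long lists:
    "images" (e.g. file paths) and "actions" (e.g. lists of floats).
    """
    # Column names are assumptions, not confirmed by this thread.
    columns: dict[str, list[Any]] = {
        "observation.image": [],
        "action": [],
        "episode_index": [],
        "frame_index": [],
        "timestamp": [],
        "next.done": [],
        "index": [],
    }
    # Half-open [from, to) range of global frame indices for each episode.
    episode_data_index: dict[str, list[int]] = {"from": [], "to": []}

    global_index = 0
    for ep_idx, episode in enumerate(episodes):
        num_frames = len(episode["actions"])
        if len(episode["images"]) != num_frames:
            raise ValueError(f"episode {ep_idx}: images and actions differ in length")

        episode_data_index["from"].append(global_index)
        for frame_idx in range(num_frames):
            columns["observation.image"].append(episode["images"][frame_idx])
            columns["action"].append(episode["actions"][frame_idx])
            columns["episode_index"].append(ep_idx)
            columns["frame_index"].append(frame_idx)
            # Time in seconds since the start of the episode.
            columns["timestamp"].append(frame_idx / fps)
            # Mark only the last frame of each episode as done.
            columns["next.done"].append(frame_idx == num_frames - 1)
            columns["index"].append(global_index)
            global_index += 1
        episode_data_index["to"].append(global_index)

    return columns, episode_data_index


# Toy input: two short episodes with made-up image paths and 2-D actions.
episodes = [
    {
        "images": ["ep0/0.png", "ep0/1.png", "ep0/2.png"],
        "actions": [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]],
    },
    {
        "images": ["ep1/0.png", "ep1/1.png"],
        "actions": [[1.0, 1.1], [1.2, 1.3]],
    },
]
columns, episode_data_index = flatten_episodes(episodes, fps=10)
print(episode_data_index)
```

A column dict like this could then be handed to `datasets.Dataset.from_dict` from the Hugging Face `datasets` library to obtain a table-style dataset. Image or video encoding and the upload to the hub are separate steps that this sketch does not cover.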