facebookresearch / r3m

Pre-training Reusable Representations for Robotic Manipulation Using Diverse Human Video Data
https://sites.google.com/view/robot-r3m/
MIT License
292 stars 45 forks

Can the robot achieve a certain task from a human without language? #33

Open Andyyoung0507 opened 1 year ago

Andyyoung0507 commented 1 year ago

Hi, I am interested in this work. In issue #5, you mentioned that the pre-trained R3M model simply acts as an encoder that maps images to embeddings. My question is how to use the whole framework in a downstream robotic task after behavior cloning. Given just an image, can the robot then perform the imitated task? What about in a more sophisticated environment? And how can the robot achieve a certain task from a human without language?