reproduce the released checkpoints

andyaloha commented 1 month ago

Could you please provide the process for reproducing the training of the 'cousin_ckpt.pth' and 'twin_ckpt.pth' files? Thank you.

cremebrule commented 4 weeks ago

Hi @andyaloha ,

Thanks for reaching out! We trained our policies on either exclusively the digital twin (twin_ckpt.pth) or 8 cousin models (cousin_ckpt.pth), collecting a total of 10000 demos in either case. We then trained both models using BC-RNN and deployed them immediately on different evaluation cases (i.e.: digital twin, or other unseen cabinets).

You can reproduce the hyperparameters used by inspecting the environment config file in the checkpoint itself!

RogerDAI1217 commented 3 weeks ago

Hi @andyaloha, to save your time, here is the models we used for the twin v.s. cousins experiments on door opening task.

Use DINO to select digital cousins:

Twin: kdbgpm
2 Cousins: tolupn, hrdeys
4 Cousins: tolupn, hrdeys, rvunhj, vceupd
8 Cousins: tolupn, hrdeys, rvunhj, vceupd, xpdbgf, pzjecb, vespxk, glefdh
Test Assets: kdbgpm, dajebq (Hold-out 2nd digital cousin), nrlayx (Hold-out 6th digital cousin), plccav

Use DINO + GPT to select digital cousins:

Twin: kdbgpm
2 Cousins: hrdeys, rvunhj
4 Cousins: hrdeys, rvunhj, nrlayx, nyfebf
8 Cousins: hrdeys, rvunhj, nrlayx, nyfebf, glefdh, smujam, vceupd, pzjecb
Test Assets: kdbgpm, dajebq (Hold-out 2nd digital cousin), xpdbgf (Hold-out 6th digital cousin), plccav
Due to LLM prompting, the digital cousin selection can contain some randomness

We trained BC-RNN with different hyperparameters. The average success rates are reported in Table 3, Table 4, Table 5, Figure 7, Figure 13, and Figure 14 in Appendix.

andyaloha commented 3 weeks ago

@cremebrule @RogerDAI1217 Thanks. I have trained a valid policy according to your instruction with twin of kdbgpm and 8 cousins in the config of twin_ckpt.pth. While in evaluation, it seems that the target object must be one of those in the loaded scene? This scene is specified in the demos collection step by scene_path, which is saved into the .hdf5 file and loaded as the test scene in the evaluation step. How can I change the target object to others not in the loaded scene (such as different hold-out cousins)?

cremebrule commented 3 weeks ago

Hi @andyaloha ,

Great question. I believe we already have this implemented -- you can simply set the eval_category_model_link_name in the evaluation script examples/4_evaluate_policy.py. Can you try that?

andyaloha commented 2 weeks ago

@cremebrule I have tried it. No matter how I change the target object by eval_category_model_link_name, the final target object is always the same kdbgpm. It seems that the target object is fixed by the scene_path as in the demos collection step, which is saved into the .hdf5 file and loaded as the test scene in the evaluation step.

cremebrule commented 1 week ago

Hi @andyaloha ,

Hmmmm that's strange. I just tested our twin checkpoint now with --eval_category_model_link_name "bottom_cabinet,dajebq,link_3", and it seems to work fine.

Do you have a repro (checkpoint + command you executed) that we can try?

cremebrule / digital-cousins

reproduce the released checkpoints #13

Use DINO to select digital cousins:

Use DINO + GPT to select digital cousins: