WangYixuan12 / d3fields

[CoRL 24] D^3Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Robotic Manipulation
https://robopil.github.io/d3fields/
MIT License

Confusing about optimize #3

Open Sjey-Lyn opened 11 months ago

Sjey-Lyn commented 11 months ago

Thank you for your excellent work! I am confused about one part: is it feasible to optimize without a dynamics model? And can the cost function be interpreted as the pixel difference between keypoints?

WangYixuan12 commented 11 months ago

Hi, thank you for your interest in our work! Yes, it is possible to optimize without a learned dynamics model. If the action is only pick and place, you can assume the dynamics model is just a 3D rigid transformation. For your second question: yes, that is correct.
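
For readers following along, here is a minimal sketch of that idea, not the repository's actual code. It assumes the keypoints are an (N, 3) NumPy array, the pick-and-place action is a rotation matrix plus a translation, and `project` is any function mapping 3D points to pixel coordinates. The names `rigid_dynamics`, `keypoint_pixel_cost`, `rotation_z`, and `action_cost` are hypothetical.

```python
import numpy as np


def rigid_dynamics(keypoints, rotation, translation):
    """Predict keypoints (N, 3) after a rigid motion: x' = R x + t."""
    return keypoints @ rotation.T + translation


def keypoint_pixel_cost(pred_uv, goal_uv):
    """Mean Euclidean pixel distance between matched 2D keypoints (N, 2)."""
    return float(np.mean(np.linalg.norm(pred_uv - goal_uv, axis=1)))


def rotation_z(theta):
    """Rotation matrix for a yaw of `theta` radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def action_cost(keypoints, rotation, translation, goal_uv, project):
    """Cost of one candidate action, evaluated in image space.

    `project` is an assumed callable mapping (N, 3) points to (N, 2) pixels.
    """
    predicted = rigid_dynamics(keypoints, rotation, translation)
    return keypoint_pixel_cost(project(predicted), goal_uv)
```

An optimizer would then search over candidate rotations and translations for the one with the lowest `action_cost`; which optimizer to use is left open here.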

Sjey-Lyn commented 11 months ago

Thank you for your answer. Would it be possible to share the code for the robotic arm simulation?

WangYixuan12 commented 11 months ago

I may consider that if many people want it, but it may take a while. (Please leave a thumbs-up if you want this feature.)

Sjey-Lyn commented 11 months ago

Okay, thanks.

WangYixuan12 commented 11 months ago

I will leave this issue open so that people can comment if they want this feature.

Sjey-Lyn commented 11 months ago

I have another question: how do you choose the reference camera used to project 3D keypoints into 2D images?

WangYixuan12 commented 11 months ago

Currently, we are manually setting the reference camera pose. But it is possible to do it automatically.
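
As an illustration of what manually setting a reference camera can look like, the sketch below builds a camera pose by hand and projects 3D keypoints with a standard pinhole model. It is an assumption-laden example rather than the repository's implementation: `look_at`, `project_points`, and the OpenCV-style axis convention (x right, y down, z forward) are choices made here for illustration.

```python
import numpy as np


def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    """Return a 4x4 world-to-camera matrix for a camera at `eye` facing `target`.

    `up` must not be parallel to the viewing direction.
    """
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    z = target - eye
    z = z / np.linalg.norm(z)          # camera forward axis
    x = np.cross(z, up)
    x = x / np.linalg.norm(x)          # camera right axis
    y = np.cross(z, x)                 # camera down axis
    rotation = np.stack([x, y, z])     # rows are camera axes in world coordinates
    world_to_cam = np.eye(4)
    world_to_cam[:3, :3] = rotation
    world_to_cam[:3, 3] = -rotation @ eye
    return world_to_cam


def project_points(points_world, world_to_cam, intrinsics):
    """Project (N, 3) world points to (N, 2) pixels; also return their depth."""
    homogeneous = np.hstack([points_world, np.ones((len(points_world), 1))])
    points_cam = (world_to_cam @ homogeneous.T).T[:, :3]
    depth = points_cam[:, 2]           # points with depth <= 0 are behind the camera
    uv_h = (intrinsics @ points_cam.T).T
    return uv_h[:, :2] / uv_h[:, 2:3], depth
```

For example, a top-down camera one meter above the workspace origin could be written as `look_at(eye=(0, 0, 1), target=(0, 0, 0), up=(0, 1, 0))`. Callers should discard points whose returned depth is not positive before using the pixel coordinates.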

Gloryseven commented 10 months ago

Thank you for your excellent work! May I ask how to train my own model?

WangYixuan12 commented 10 months ago

Hi, actually our work does not require training a model. You only need off-the-shelf foundation models.

Gloryseven commented 10 months ago

Thank you for your timely reply! What are the GPU requirements for running this model? And how can I get the robotic simulation in the paper?

WangYixuan12 commented 10 months ago

We use an Nvidia 3090 to run the model. For the simulation, I may need to find some time to organize the code. We use OmniGibson for simulation.

Gloryseven commented 10 months ago

May I ask where these datasets come from? How do I create my own dataset?

WangYixuan12 commented 10 months ago

I collected the dataset on my own using four RGBD cameras. You could use RGBD cameras and create data with a similar file structure.

Gloryseven commented 10 months ago

Hello! I'm very interested in this work and would like to ask some questions:

  1. Does "OmniGibson for simulation" refer to the 'planning' part of the paper? Are all the experiments conducted in simulation, so that I do not need a real robotic arm?
  2. Do the robot code and the already open-sourced code together make up the entire code for the paper? Is the robotic arm simulation code written with OmniGibson?

WangYixuan12 commented 10 months ago

  1. Yes, the sim code corresponds to the planning part of the paper. But we also have real robot experiments.
  2. Yes (+ planning code for real-world experiments)

Gloryseven commented 10 months ago

Thank you for your reply! If I run the robot experiments in simulation, will the dataset (RGBD images) be captured and generated by the OmniGibson simulation platform?

WangYixuan12 commented 10 months ago

Yep

Gloryseven commented 10 months ago

You said that no training is needed, so what is the pretrained model downloaded by 'bash scripts/download_ckpts.sh' in the README used for? Is it related to the optimization process in OmniGibson? Was the pretrained model produced with OmniGibson?

WangYixuan12 commented 10 months ago

OmniGibson is a simulation platform, while the pre-trained model is used to construct our representation.

Bailey-24 commented 9 months ago

What do you see as the limitations of your work?

WangYixuan12 commented 9 months ago

I think the control part could be more advanced.

Sjey-Lyn commented 8 months ago

How many sets of matching points are needed to calculate the transformation matrix?

WangYixuan12 commented 8 months ago

More matching points lead to a more stable transformation matrix. A typical choice is 100 points.
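
To make the point about stability concrete, here is a sketch of one standard way to fit a rigid transformation to matched 3D points: the SVD-based Kabsch method. The thread does not say which solver the authors use, so treat this as an assumed stand-in; `estimate_rigid_transform` is a hypothetical name. Three non-collinear correspondences are the mathematical minimum, and extra points average out matching noise.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that dst ~= src @ R.T + t.

    `src` and `dst` are matched (N, 3) arrays with N >= 3 non-collinear points.
    """
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    covariance = (src - src_centroid).T @ (dst - dst_centroid)
    u, _, vt = np.linalg.svd(covariance)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    translation = dst_centroid - rotation @ src_centroid
    return rotation, translation
```

With noisy correspondences, this closed-form fit is often wrapped in an outlier-rejection loop such as RANSAC; that part is omitted here.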

Bailey-24 commented 6 months ago

[image] The knife blade does not match the goal image.

WangYixuan12 commented 6 months ago

Since DINOv2 cannot distinguish the sides of blades, it is expected that these two do not match.

Bailey-24 commented 6 months ago

I found a similar approach in this repo that uses DINO to find corresponding features, but it needs the goal image's depth to compute the transformation. When the goal image is generated by AI, it does not have depth information, so what is your method? In the code I only found a comparison of the difference between two images, https://github.com/WangYixuan12/d3fields/blob/5158f48ac6314bd9fbfea532b4c0e40a11493c17/fusion.py#L1729 , but I have no idea how to get the transformation from it. Do you use the learned MPC to directly get the action? [image]

https://gist.github.com/normandipalo/fbc21f23606fbe3d407e22c363cb134e

WangYixuan12 commented 5 months ago

Actually, our method does not need a depth image for the goal image. We assume that there is a floating reference camera in the workspace. The projected 2D image is compared with the goal image, so no depth image is needed.