cremebrule / digital-cousins

Codebase for Automated Creation of Digital Cousins for Robust Policy Learning
https://digital-cousins.github.io
Apache License 2.0

align_model_pose #11

Closed: yanxinhao closed this issue 1 week ago

yanxinhao commented 1 month ago

[screenshot attachment]

pc_obj is still in the input frame's camera space, but z_rot_mat should be applied in the OG world space. Maybe this rotation cannot align the pose of pc_obj with the obj model.

cremebrule commented 1 month ago

Hi @yanxinhao ,

Thanks for bringing this up! To be clear, the point cloud (in the camera image frame) first gets rotated such that it is aligned with the OG world frame z axis -- this is done by undoing the tilt_angle (angle between the floor plane and the camera pose expressed in the camera image frame). Then, we undo the resultant z rotation by applying the matched cousin z-pose and pan offset calculated from our aggregated dataset. At this point, the point cloud should be expressed in OG world space, albeit with a translational offset!
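For concreteness, here is a minimal numpy sketch of that chain. All names (`tilt_angle`, `z_offset`, `rot_x`, `rot_z`) and values are illustrative, not the repo's exact API; it only tracks orientation, since the translation offset is handled separately:

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]], dtype=float)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

pc_obj = np.random.rand(1000, 3)   # object point cloud in the camera image frame
tilt_angle = np.deg2rad(15.0)      # angle between the floor plane and the camera pose
z_offset = np.deg2rad(30.0)        # matched cousin z-pose + pan offset from the dataset

# Step 1: undo the camera tilt so the cloud's up-direction matches OG's world z axis.
pc_tilted = pc_obj @ rot_x(-tilt_angle).T
# Step 2: undo the remaining rotation about z using the matched cousin pose.
pc_world = pc_tilted @ rot_z(-z_offset).T
# pc_world is now in OG world orientation, up to a translational offset.
```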

If you're getting bad orientation matches, it's possible that GPT / DINO simply selects bad orientations during the cousin matching process, which can happen under certain cases. Do you have an example that showcases this behavior?

yanxinhao commented 1 month ago

Hi @cremebrule , thanks for your patient response. From my understanding, the z-rotation transformation should be applied to a centered object, but `pc_obj @ tilt_mat` is only aligned with the ground plane, not centered.
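A toy numpy sketch of the distinction being raised here, rotating about the world origin versus about the object's own centroid (all data and names are hypothetical):

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

pc = np.random.rand(100, 3) + 5.0   # point cloud sitting far from the origin
Rz = rot_z(np.deg2rad(45.0))

# Rotating the raw cloud changes both its orientation AND its position:
about_origin = pc @ Rz.T
# Rotating about the centroid changes orientation only:
centroid = pc.mean(axis=0)
in_place = (pc - centroid) @ Rz.T + centroid
```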

Maybe I'm unclear about something regarding the OG world space; I'm not very familiar with the OG dataset pipeline. Where can I find the documentation? In my project, I want to put a generated object's obj/glb file into the scene.

[screenshot attachment]

cremebrule commented 4 weeks ago

Hi @yanxinhao ,

Ah, yes, that is a good point. We don't need to center it because we assume our model's cousin pose selection has already taken into account any translation offset (since we pass the masked object image to DINO / GPT during cousin selection).

We did try pre-centering the masked object so that the effective relative camera poses with respect to the object were the same between the masked object and all dataset images, but found that the results were not as good. This is likely because viewing the effective point cloud from a different angle results in holes in the new 2D camera view projection; these could be filled using heuristics, but the results did not always look realistic.
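A toy illustration of that hole problem, assuming a hypothetical pinhole camera (nothing here is the repo's code): splatting a point cloud into a depth image leaves pixels that no point lands on, and the empty fraction grows as the viewpoint moves away from the one the cloud was captured from:

```python
import numpy as np

def splat_depth(pc, K, h=240, w=320):
    """Project points (N, 3) in the camera frame onto a nearest-depth image."""
    depth = np.full((h, w), np.inf)
    z = pc[:, 2]
    keep = z > 1e-6                          # drop points behind the camera
    uvw = pc[keep] @ K.T
    u = np.clip((uvw[:, 0] / uvw[:, 2]).astype(int), 0, w - 1)
    v = np.clip((uvw[:, 1] / uvw[:, 2]).astype(int), 0, h - 1)
    np.minimum.at(depth, (v, u), z[keep])    # keep the nearest point per pixel
    return depth

K = np.array([[300.0, 0.0, 160.0],
              [0.0, 300.0, 120.0],
              [0.0, 0.0, 1.0]])
pc = np.random.rand(20000, 3) - [0.5, 0.5, -2.0]   # toy cloud ~2m in front of camera
print("empty pixel fraction:", np.isinf(splat_depth(pc, K)).mean())
```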

Regarding a custom dataset, you can probably use custom assets, as long as the dataset is constructed in the same way as the OG dataset. We've hardcoded the numbers for now, though we can make them modifiable macros in a future release to allow for more modular extension.

You'll also need to format your assets to be compatible with OmniGibson. This mainly consists of converting them into USD files and importing them as a USDObject instead of a DatasetObject (HERE and HERE during the scene generation process).
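As a rough sketch of that import path (the asset path and object name are placeholders, and the exact USDObject constructor and object-import calls may vary across OmniGibson versions, so check the signatures in your installed release):

```python
import omnigibson as og
from omnigibson.objects import USDObject

# Placeholder name/path: point this at your converted USD asset.
obj = USDObject(
    name="my_generated_object",
    usd_path="/path/to/my_generated_object.usd",
)

# Assumes a simulator/scene is already running (e.g. via og.launch());
# newer OmniGibson releases add objects through the scene instead.
og.sim.import_object(obj)
```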

cremebrule commented 1 week ago

Closing this issue for now as there's been no response for a few weeks. Feel free to re-open if you continue to run into issues!