jcorsetti / oryon

Official implementation of CVPR24 Highlight paper "Open-vocabulary object 6D pose estimation"

pose estimation #5

Closed zingaltern closed 5 days ago

zingaltern commented 2 months ago

Hi, I would like to know whether your method estimates the relative pose of the same object between two scenes rather than the absolute object pose. If it focuses on relative pose, why not compare with RelPose and RelPose++?

zingaltern commented 2 months ago

By the way, I would like to confirm whether the work focuses on object segmentation rather than pose estimation, since the pose is obtained by point cloud registration.

jcorsetti commented 2 months ago

Hi @zingaltern, thank you for your interest in our work. It is true that we also focus on relative pose estimation, but the assumptions of our method and of RelPose/RelPose++ are quite different, as those methods:

  1. Use RGB-only frames, as opposed to RGBD
  2. Assume the object to be centered in the image (i.e., a prior crop was obtained)
  3. Can only obtain the translation component up to a scale, instead of an absolute value

Therefore a direct comparison between Oryon and RelPose/RelPose++ would not have been fair.
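
To illustrate why depth removes the scale ambiguity mentioned in point 3, here is a minimal sketch of standard pinhole back-projection. It is not taken from the Oryon codebase; the function name `backproject` and the intrinsics matrix `K` are assumptions made for this example. With a metric depth map, each pixel maps to a 3D point in metric units, so a pose registered on such points has an absolute translation.

```python
import numpy as np

def backproject(uv, depth, K):
    """Lift integer pixel coordinates (u, v) to 3D camera-frame points.

    uv:    (N, 2) integer array of pixel coordinates, column 0 = u (x), column 1 = v (y)
    depth: (H, W) depth map in metric units (e.g. meters)
    K:     (3, 3) pinhole intrinsics [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
    """
    u, v = uv[:, 0], uv[:, 1]
    z = depth[v, u]                      # depth is indexed as (row, column)
    x = (u - K[0, 2]) * z / K[0, 0]      # X = (u - cx) * Z / fx
    y = (v - K[1, 2]) * z / K[1, 1]      # Y = (v - cy) * Z / fy
    return np.stack([x, y, z], axis=1)
```

Without the depth value `z`, the same pixel only defines a ray, which is why RGB-only methods recover translation up to a scale.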

On your second question: Oryon indeed focuses on pose estimation. The segmentation part is needed to obtain the object region, which is used to filter the matches and avoids relying on an external detector, as is common in other methods. Using a point cloud registration module is also quite common in the literature: many works rely on methods such as RANSAC or TEASER to obtain the pose from the predicted correspondences.
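
As a rough illustration of how a pose can be obtained from predicted 3D-3D correspondences, the sketch below combines a Kabsch (SVD-based) rigid fit with a simple RANSAC loop. It is a generic example, not the registration module used in Oryon; the function names, the inlier threshold and the iteration count are all assumptions chosen for the example.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)            # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def ransac_pose(src, dst, iters=500, thresh=0.01, seed=0):
    """Estimate (R, t) from (N, 3) correspondences that may contain outliers."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)   # minimal sample
        R, t = kabsch(src[idx], dst[idx])
        err = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        raise ValueError("not enough inliers to estimate a pose")
    R, t = kabsch(src[best], dst[best])            # refit on all inliers
    return R, t, best
```

Here `src` and `dst` would be the matched 3D points from the two scenes, already restricted to the object region.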

Let me know if you have more questions.

zingaltern commented 2 months ago

Thanks for your reply. You say Oryon focuses on pose estimation: have you tried other deep-learning-based point cloud registration methods to explore the influence of this step (i.e., the effect of point cloud registration on pose accuracy)? The reason I wanted to confirm that the focus is on pose estimation is that, for the segmentation part, I think it would be possible to use other existing open-world segmentation methods.

jcorsetti commented 2 months ago

No, we did not try other deep learning methods for registration. Besides PointDSC, we tried RANSAC, as detailed in the ablation, and MAC, which performed slightly worse than PointDSC. In principle, any correspondence-based architecture can be used in place of PointDSC.

Of course, you can train Oryon for matching only and use an external segmentor to get the mask; this is similar to what we did when we reported the results with OVSeg masks.
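
For illustration, filtering matches with masks from an external segmentor could look like the sketch below. This is a hypothetical helper, not code from the Oryon repository; `filter_matches_by_mask` and its argument names are assumptions, and the masks are assumed to be boolean arrays at the same resolution as the keypoint coordinates.

```python
import numpy as np

def filter_matches_by_mask(kpts_a, kpts_q, mask_a, mask_q):
    """Keep only matches whose two endpoints both fall inside the object masks.

    kpts_a, kpts_q: (N, 2) integer pixel coordinates (x, y) of matched points
    mask_a, mask_q: (H, W) boolean object masks for the two images
    """
    in_a = mask_a[kpts_a[:, 1], kpts_a[:, 0]]   # masks are indexed as (row, column)
    in_q = mask_q[kpts_q[:, 1], kpts_q[:, 0]]
    keep = in_a & in_q
    return kpts_a[keep], kpts_q[keep], keep
```

The surviving matches can then be passed to whichever registration method is used to compute the pose.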