nv-nguyen / gigapose

[CVPR 2024] PyTorch implementation of GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence
https://nv-nguyen.github.io/gigaPose/
MIT License

Questions about the paper #5

Closed xiexie123 closed 9 months ago

xiexie123 commented 9 months ago

Thanks for your great work!

I am reading the paper. Section 3.3 says: "To recover the remaining 2 DoFs, scale s and in-plane rotation α, we train deep networks to directly regress these values from a single 2D-2D correspondence. Since the feature extractor F_ae is invariant to in-plane rotation and scaling, the corresponding features cannot be used to regress those values, hence we have to train another feature extractor we call F_ist".

Why is that? A simple image registration method can be built on SIFT descriptors, and SIFT descriptors are also invariant to in-plane rotation and scaling.

xiexie123 commented 9 months ago

I see. You use the features to regress the scale s and in-plane rotation α. Another question: since you already have correspondences from F_ae, why not use them with RANSAC to estimate the scale s, in-plane rotation α, and translation t, as in a simple image registration method based on SIFT descriptors?
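For readers unfamiliar with the alternative being proposed here, the following is a minimal sketch of generic RANSAC-based 2D similarity estimation from point correspondences. It is not taken from the GigaPose code: the function names, the inlier threshold, and the iteration count are all assumptions chosen for illustration. Each RANSAC sample uses two correspondences (n=2), which is the minimum needed to solve for scale, in-plane rotation, and translation in closed form.

```python
import numpy as np


def to_complex(points):
    """View an (N, 2) array of 2D points as N complex numbers x + iy."""
    points = np.asarray(points, dtype=float)
    return points[:, 0] + 1j * points[:, 1]


def ransac_similarity(src, dst, iters=200, thresh=2.0, seed=0):
    """Estimate (s, alpha, t) such that dst ~ s * R(alpha) @ src + t.

    Hypothetical helper for illustration; not part of the GigaPose code.
    """
    zs, zd = to_complex(src), to_complex(dst)
    rng = np.random.default_rng(seed)
    best_count, best_model = 0, None
    for _ in range(iters):
        i, j = rng.choice(len(zs), size=2, replace=False)
        if zs[i] == zs[j]:
            continue  # degenerate sample: the two source points coincide
        # Minimal solver (n=2): dst = a * src + b, with a = s * exp(i * alpha).
        a = (zd[j] - zd[i]) / (zs[j] - zs[i])
        b = zd[i] - a * zs[i]
        # Count correspondences whose transfer error is below thresh pixels.
        count = int(np.sum(np.abs(a * zs + b - zd) < thresh))
        if count > best_count:
            best_count, best_model = count, (a, b)
    if best_model is None:
        raise ValueError("no valid sample found")
    a, b = best_model
    # Scale is |a|, rotation is arg(a) in radians, translation is b.
    return abs(a), np.angle(a), np.array([b.real, b.imag]), best_count
```

Writing the transform as one complex multiplication plus an offset keeps the two-point solver to two lines. A production pipeline would typically refit the model on all inliers after the loop, which this sketch omits.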

nv-nguyen commented 9 months ago

Thanks for your interest!

Table 4 shows an ablation study of different ways to predict the scale and in-plane rotation from multiple correspondences, as you mentioned (n=2 or n=4). Our method predicts the full 6D pose from a single correspondence (n=1), which is a different approach and outperforms the alternatives.
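To make the n=1 case concrete, here is a small sketch of the geometry only. It assumes the scale s and in-plane rotation α are already available (in the paper they are regressed by networks; here they are plain function arguments), and shows that a single 2D-2D correspondence then fixes the 2D translation, giving a full 2D similarity transform. The function name and the 2x3 matrix layout are assumptions for illustration, not the repository's API.

```python
import numpy as np


def similarity_from_single_correspondence(p_template, p_query, scale, alpha):
    """Return a 2x3 matrix M with M @ [x, y, 1] mapping template to query pixels.

    Hypothetical helper: scale and alpha (radians) are assumed to be given.
    """
    c, s = np.cos(alpha), np.sin(alpha)
    # Linear part: rotation by alpha followed by uniform scaling.
    A = scale * np.array([[c, -s], [s, c]])
    # The translation is whatever makes the one correspondence map exactly.
    t = np.asarray(p_query, dtype=float) - A @ np.asarray(p_template, dtype=float)
    return np.hstack([A, t[:, None]])


# Usage: scale 2 and a 90 degree rotation, with template pixel (1, 0)
# matched to query pixel (5, 5).
M = similarity_from_single_correspondence((1.0, 0.0), (5.0, 5.0), 2.0, np.pi / 2)
mapped = M @ np.array([1.0, 0.0, 1.0])  # maps back onto the query pixel
```

Because s and α are supplied rather than solved for, no second correspondence is needed, which is the difference from the n=2 and n=4 variants discussed in this thread.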

nv-nguyen commented 9 months ago

I closed the issue, but feel free to reopen it if you have additional questions!