Closed xiexie123 closed 9 months ago
I see. You use features to regress the scale s and in-plane rotation α. Another question: since you already have the correspondences from Fae, why not use those correspondences with RANSAC to get the scale s, in-plane rotation α, and translation t, as in a simple image registration method based on SIFT descriptors?
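For readers following along, here is a minimal sketch of the alternative this question describes: fitting a 2D similarity transform (scale s, in-plane rotation α, translation t) to 2D-2D correspondences with RANSAC. It is not the paper's method. The function names, the 2-point minimal sample, and the pixel threshold are all assumptions made for illustration, and the correspondences are assumed to be given as two (N, 2) NumPy arrays.

```python
import numpy as np


def fit_similarity(src, dst):
    """Least-squares 2D similarity (s, alpha, t) such that dst ~ s * R(alpha) @ src + t."""
    # Treat 2D points as complex numbers: q ~ a * p + b, with a = s * exp(i * alpha).
    p = src[:, 0] + 1j * src[:, 1]
    q = dst[:, 0] + 1j * dst[:, 1]
    pc, qc = p - p.mean(), q - q.mean()
    a = np.vdot(pc, qc) / np.vdot(pc, pc)  # vdot conjugates its first argument
    b = q.mean() - a * p.mean()
    return abs(a), np.angle(a), np.array([b.real, b.imag])


def apply_similarity(s, alpha, t, pts):
    """Apply the similarity transform to an (N, 2) array of points."""
    c, sn = np.cos(alpha), np.sin(alpha)
    R = np.array([[c, -sn], [sn, c]])
    return s * pts @ R.T + t


def ransac_similarity(src, dst, n_iters=200, thresh=2.0, seed=0):
    """Hypothetical RANSAC loop: sample 2 correspondences, keep the largest consensus set."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=2, replace=False)
        if np.allclose(src[idx[0]], src[idx[1]]):
            continue  # degenerate sample: both source points coincide
        s, alpha, t = fit_similarity(src[idx], dst[idx])
        err = np.linalg.norm(apply_similarity(s, alpha, t, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 2:
        raise ValueError("RANSAC found no usable consensus set")
    # Refit on all inliers of the best hypothesis.
    return fit_similarity(src[best_inliers], dst[best_inliers]), best_inliers
```

Note that a similarity transform needs at least two correspondences per hypothesis, which is the n=2 setting rather than the single-correspondence (n=1) setting discussed in this thread.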
Thanks for your interest!
Table 4 shows an ablation study of different ways to predict the scale and in-plane rotation, including from multiple correspondences as you mentioned (n=2 or n=4). Our method predicts the full 6D pose from a single correspondence (n=1), which differs from and outperforms the other approaches.
I closed the issue, but feel free to re-open it if you have additional questions!
Thanks for your great work!
I am reading the paper. Section 3.3 says: "To recover the remaining 2 DoFs, scale s and in-plane rotation α, we train deep networks to directly regress these values from a single 2D-2D correspondence. Since the feature extractor Fae is invariant to in-plane rotation and scaling, the corresponding features cannot be used to regress those values, hence we have to train another feature extractor we call Fist".
Why? A simple image registration method is based on SIFT descriptors, and SIFT descriptors are also invariant to in-plane rotation and scaling.
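As background for the SIFT comparison: in SIFT-style pipelines the descriptor is invariant because the detector assigns each keypoint a scale and a dominant orientation, and the descriptor is computed in that normalized frame. The relative scale and rotation between two matched keypoints therefore come from the keypoint attributes, not from the descriptor values. A minimal sketch, assuming each keypoint is a hypothetical `(x, y, size, angle_deg)` tuple (similar in spirit to the `size` and `angle` fields of an OpenCV keypoint):

```python
def relative_scale_rotation(kp_a, kp_b):
    """Relative scale and in-plane rotation (degrees) from one keypoint match.

    Each keypoint is an assumed (x, y, size, angle_deg) tuple; the layout is
    illustrative and not taken from the paper or its code.
    """
    s = kp_b[2] / kp_a[2]
    # Wrap the angle difference into [-180, 180).
    alpha = (kp_b[3] - kp_a[3] + 180.0) % 360.0 - 180.0
    return s, alpha
```

A learned extractor that outputs only an invariant feature vector has no such per-keypoint scale and orientation attached, which may be one way to read the quoted passage from Section 3.3.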