ActiveVisionLab / DFNet

DFNet: Enhance Absolute Pose Regression with Direct Feature Matching (ECCV 2022)
https://dfnet.active.vision
MIT License

Question about the pipeline in the paper #23

Closed: LZL-CS closed this issue 2 months ago

LZL-CS commented 5 months ago

Hi @chenusc11, I am confused about why the GT pose P is used here as the input to NeRF, rather than the pose predicted by the PoseNet model. [image]

Whereas here you seem to use the pose predicted by the PoseNet model as the input to NeRF: [image]

chenusc11 commented 2 months ago

Hi, really sorry for missing this thread. I wasn't aware of this issue until now. I apologize for this.

Figure 2b illustrates the high-level scheme for training the DFNet features. During training, we want the features of real images and synthetic images (rendered by NeRF) from the same pose to be as close as possible, and the features of real and synthetic images from different poses to be as far apart as possible. That is why we input the GT poses for the first objective and roll the GT poses for the second. Both can be found in the triplet loss implementation in our code.
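For readers following along, here is a minimal sketch of the idea described above. It is not the DFNet implementation. It assumes that "rolling" the GT poses means a cyclic shift along the batch, so each real image is paired with another sample's pose, and it uses a standard margin-based triplet loss. The names `roll`, `triplet_loss`, and `toy_synthetic_features` are hypothetical, and the toy feature function only stands in for "render with NeRF, then extract features".

```python
import math


def roll(batch, shift=1):
    """Cyclically shift a batch so element i moves to index i + shift."""
    shift %= len(batch)
    return batch[-shift:] + batch[:-shift] if shift else list(batch)


def l2(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchors, positives, negatives, margin=1.0):
    """Mean of max(d(a, p) - d(a, n) + margin, 0) over the batch."""
    losses = [
        max(l2(a, p) - l2(a, n) + margin, 0.0)
        for a, p, n in zip(anchors, positives, negatives)
    ]
    return sum(losses) / len(losses)


def toy_synthetic_features(pose):
    # Hypothetical stand-in for "render an image at `pose` with NeRF,
    # then run the feature extractor on it".
    return [2.0 * v for v in pose]


# Toy batch: GT poses and features of the matching real images (made-up values).
gt_poses = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
real_feats = [[0.1, 0.0], [6.0, 0.1], [0.0, 8.1]]

# Positives: synthetic features rendered at the same GT pose.
pos_feats = [toy_synthetic_features(p) for p in gt_poses]
# Negatives: synthetic features rendered at rolled (mismatched) GT poses.
neg_feats = [toy_synthetic_features(p) for p in roll(gt_poses)]

loss = triplet_loss(real_feats, pos_feats, neg_feats, margin=1.0)
```

In this toy batch the loss is zero, because every real feature is already much closer to its same-pose synthetic feature than to the rolled one. Swapping the positives and negatives gives a positive loss, which is the signal that would push the features apart during training.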

LZL-CS commented 2 months ago

Hi @chenusc11 Thanks for your reply, I got it. I am closing this issue now.