zju3dv / EfficientLoFTR

Apache License 2.0

Fine Preprocess feature size. #11

Closed AliYoussef97 closed 7 months ago

AliYoussef97 commented 7 months ago

Hello!

I have a brief question about the dimension of image_1's fine features, in particular the addition of +2 when unfolding the local windows. I fail to understand the reasoning behind the +2, since further along the pipeline conf_matrix_ff has a size of [M, W**2, (W+2)**2]. Although softmax_matrix_f does become [M, W**2, W**2], conf_matrix_ff is stored as [M, W**2, (W+2)**2].

I would really appreciate it if you could explain the +2.

Thank you!

wyf2020 commented 7 months ago

Sorry for the late reply. This is a great question! The "+2" comes from our two-stage refinement process. In the first stage, we crop non-overlapping 8x8 fine windows from both images and take the argmax of the confidence. In the second stage, we crop a 3x3 local window around the argmax location in the right image. If the first-stage argmax lies on the edge of an 8x8 window, that 3x3 local window extends one pixel beyond it, so it has to be cropped from a 10x10 fine window (8 + 2 = 10) in the right image. Since the 8x8 fine windows do not overlap, we cannot compensate by zero-padding the 3x3 local windows or by discarding edge argmaxes. Therefore, in the fine preprocess, we unfold 10x10 fine windows in the right image in advance for use in the second stage.
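
To make the geometry above concrete, here is a minimal pure-Python sketch. It is not code from the repository: the names `local_window_coords` and `fits` are hypothetical, and it only checks window bounds, assuming W = 8 and a 3x3 second-stage window as described in the reply.

```python
W = 8  # side length of a first-stage fine window (value taken from the thread)
R = 1  # radius of the 3x3 second-stage local window


def local_window_coords(r, c, pad=R):
    """Coordinates of the 3x3 neighbourhood of (r, c), expressed in a window
    that was unfolded with `pad` extra pixels on every side."""
    return [(r + pad + dr, c + pad + dc)
            for dr in range(-R, R + 1)
            for dc in range(-R, R + 1)]


def fits(coords, side):
    """True if every coordinate lies inside a side x side window."""
    return all(0 <= y < side and 0 <= x < side for y, x in coords)


# With a padded (W+2) x (W+2) window, every first-stage argmax position
# has a complete 3x3 neighbourhood, including positions on the edge.
all_fit_padded = all(fits(local_window_coords(r, c), W + 2)
                     for r in range(W) for c in range(W))

# With a plain W x W window (no extra border), only interior positions do.
n_fit_unpadded = sum(fits(local_window_coords(r, c, pad=0), W)
                     for r in range(W) for c in range(W))

print(all_fit_padded)   # True
print(n_fit_unpadded)   # 36 of the 64 positions
```

Under these assumptions, 28 of the 64 possible argmax positions sit on the border of an 8x8 window, which is why the extra one-pixel ring is unfolded up front rather than handled case by case.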

AliYoussef97 commented 7 months ago

@wyf2020 Thank you so much, this makes so much sense!

Just for clarification: the $8 \times 8$ window softmax_matrix_f -> get_fine_ds_match to get the keypoints from the $8 \times 8$ window. Following that, keypoints -> get the $3 \times 3$ local window from conf_matrix_ff centered on the pixel-level keypoints. I hope my understanding is correct?
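
The two steps summarised above can be sketched for a single match as follows. This is an illustrative toy, not the repository's get_fine_ds_match: the helpers `argmax_2d` and `padded_flat_indices` are hypothetical, the score matrix is made up, and a row-major flattening of both the W x W and (W+2) x (W+2) windows is assumed.

```python
W = 8  # fine window side length, as in the thread


def argmax_2d(matrix):
    """Return (row, col) of the largest entry of a list-of-lists matrix."""
    best = max((v, i, j) for i, row in enumerate(matrix) for j, v in enumerate(row))
    return best[1], best[2]


def padded_flat_indices(idx_r, w=W):
    """Flat indices, along the (w+2)**2 axis, of the 3x3 neighbourhood around
    position idx_r of the flattened w*w right-image window (row-major)."""
    r, c = divmod(idx_r, w)
    side = w + 2
    # The +1 shifts from w x w coordinates into the padded window.
    return [(r + 1 + dr) * side + (c + 1 + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


# Stage 1: toy [W**2, W**2] score matrix for one match, with its peak placed
# at the bottom-right corner of the right-image window (index 63).
scores = [[0.0] * (W * W) for _ in range(W * W)]
scores[5][63] = 1.0
idx_l, idx_r = argmax_2d(scores)

# Stage 2: positions to read along the (W+2)**2 axis of a
# [W**2, (W+2)**2] matrix, in row idx_l.
window = padded_flat_indices(idx_r)
print(idx_l, idx_r)  # 5 63
print(window)        # [77, 78, 79, 87, 88, 89, 97, 98, 99]
```

The corner peak is the interesting case: its 3x3 neighbourhood reaches index 99, the last entry of the flattened 10x10 window, which would not exist in a plain 8x8 window.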

wyf2020 commented 7 months ago

Yes, exactly! Your understanding is correct.

AliYoussef97 commented 7 months ago

@wyf2020 Thank you so much!