Parskatt / DKM

[CVPR 2023] DKM: Dense Kernelized Feature Matching for Geometry Estimation
https://parskatt.github.io/DKM/

e_R reaches 180° #33

Closed · Moreland-cas closed this issue 1 year ago

Moreland-cas commented 1 year ago

I noticed that the model's rotation estimation error reaches the maximum of 180° for quite a lot of image pairs. Do you have any idea what might be causing this?

Moreland-cas commented 1 year ago

At the same time, the points sampled by the model sometimes produce strange slanted lines, as depicted in the picture.

Parskatt commented 1 year ago

Can you provide the exact image pairs? I have not noticed that bug before.

Moreland-cas commented 1 year ago

> Can you provide the exact image pairs? I have not noticed that bug before.

Sure! Please check the following pairs from the MegaDepth LoFTR test set:

/megadepth/Undistorted_SfM/0022/images/3216263284_1c2f358e5a_o.jpg /megadepth/Undistorted_SfM/0022/images/624570534_077746f40c_o.jpg

/megadepth/Undistorted_SfM/0022/images/3715689873_8b49e3676d_o.jpg /megadepth/Undistorted_SfM/0022/images/1390794615_4e2efcc84d_o.jpg

/megadepth/Undistorted_SfM/0022/images/2676532132_7c55a4b43a_o.jpg /megadepth/Undistorted_SfM/0022/images/2899024525_d3c7b3b33c_o.jpg

/megadepth/Undistorted_SfM/0022/images/1469603948_0052cdbe5d_o.jpg /megadepth/Undistorted_SfM/0022/images/3722427422_de83a590d7_o.jpg

/megadepth/Undistorted_SfM/0022/images/2142521640_9a08bee026_o.jpg /megadepth/Undistorted_SfM/0022/images/2377284574_dcf50dc5c7_o.jpg

/megadepth/Undistorted_SfM/0022/images/453068332_a802fc9422_o.jpg /megadepth/Undistorted_SfM/0022/images/325288797_537a42bb6f_o.jpg

/megadepth/Undistorted_SfM/0015/images/2087061093_e736a18497_o.jpg /megadepth/Undistorted_SfM/0015/images/3305213780_e460e68085_o.jpg

/megadepth/Undistorted_SfM/0015/images/143983789_f030e8f660_o.jpg /megadepth/Undistorted_SfM/0015/images/2960579238_7ba551628b_o.jpg

/megadepth/Undistorted_SfM/0015/images/3602433917_f463ce87ae_o.jpg /megadepth/Undistorted_SfM/0015/images/3159096518_744c87899e_o.jpg

Moreland-cas commented 1 year ago

(screenshots of the failure cases were attached here)

Moreland-cas commented 1 year ago

The green dots are pairs produced by `dkm_model.sample()` with `symmetric=False`. The blue point in the right picture is the correct correspondence of the green point in the left picture, computed by applying the depth and the intrinsic parameters.
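
(For reference, a minimal sketch of how such a depth-based ground-truth correspondence is typically computed; the function and argument names below are hypothetical, not from the DKM codebase:)

```python
import numpy as np

def warp_with_depth(p1, depth1, K1, K2, R, t):
    """Project pixel p1 from image 1 into image 2 using depth and intrinsics.

    p1: (u, v) pixel in image 1; depth1: its depth;
    R, t: relative pose taking camera-1 coordinates to camera-2.
    """
    # back-project the pixel to a 3D point in the camera-1 frame
    x1 = np.linalg.inv(K1) @ np.array([p1[0], p1[1], 1.0])
    X1 = depth1 * x1
    # transform into the camera-2 frame and project with K2
    X2 = R @ X1 + t
    p2 = K2 @ X2
    return p2[:2] / p2[2]
```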

Parskatt commented 1 year ago

Thanks, I'll check it out.

Parskatt commented 1 year ago

The dense warp seems to have no issue; it might be something wrong with the sampling, since I recently changed from np.random.choice to torch.multinomial. Checking.
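
(For context, certainty-weighted sampling with torch.multinomial looks roughly like the sketch below; the tensor shapes are an assumption, and this is not the exact DKM implementation:)

```python
import torch

def weighted_sample(warp, certainty, num=5000):
    # flatten the dense warp and draw match indices with probability
    # proportional to the predicted certainty
    matches = warp.reshape(-1, 4)      # assumed (H*W, 4): x_A, y_A, x_B, y_B
    weights = certainty.reshape(-1)
    idx = torch.multinomial(weights, num, replacement=False)
    return matches[idx], weights[idx]
```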

Parskatt commented 1 year ago

I cannot reproduce this. Could it be that what you are plotting are the inliers after RANSAC? It might be that the estimation converges to a degenerate solution with the camera being mirrored (I say this because the translation error seems to be low).

It also seems like the points lie approximately on a plane; if you estimated a fundamental matrix, planar scenes are a degenerate case.
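
(One way to test this planar-degeneracy hypothesis: fit a homography to the same matches and compare inlier counts. A sketch, assuming kpts1/kpts2 are the sampled pixel coordinates as N x 2 arrays:)

```python
import cv2

def planarity_ratio(kpts1, kpts2, thresh=3.0):
    # if most fundamental-matrix inliers are also homography inliers,
    # the scene is close to planar and F is poorly constrained
    _, mask_F = cv2.findFundamentalMat(kpts1, kpts2, cv2.USAC_MAGSAC, thresh)
    _, mask_H = cv2.findHomography(kpts1, kpts2, cv2.USAC_MAGSAC, thresh)
    return mask_H.sum() / max(mask_F.sum(), 1)
```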

Parskatt commented 1 year ago

Could you share which estimator you used? Was it the OpenCV estimator used for MegaDepth-1500?

Moreland-cas commented 1 year ago

> Could you share which estimator you used? Was it the OpenCV estimator used for MegaDepth-1500?

I first estimated the fundamental matrix, then the essential matrix, and computed R and t from it. Here are the functions I used to calculate F and E:

```python
import cv2
from PIL import Image


def calculate_F(dkm_model, pair_data):
    image1_path = pair_data["image1_path"]
    image2_path = pair_data["image2_path"]

    W_A, H_A = Image.open(image1_path).size
    W_B, H_B = Image.open(image2_path).size

    warp_est = pair_data["warp_est"]
    warp_certainty_est = pair_data["warp_certainty_est"]

    # sample matches for estimation
    sample_matches, sample_certainty = dkm_model.sample(warp_est, warp_certainty_est, num=5000)
    kpts1, kpts2 = dkm_model.to_pixel_coordinates(sample_matches, H_A, W_A, H_B, W_B)

    # estimate the fundamental matrix with MAGSAC
    F_est, mask = cv2.findFundamentalMat(
        kpts1.cpu().numpy(),
        kpts2.cpu().numpy(),
        ransacReprojThreshold=0.2,
        method=cv2.USAC_MAGSAC,
        confidence=0.999999,
        maxIters=10000,
    )
    # print("percentage of inliers: ", 100 * mask.mean())

    # save kpts1, kpts2, mask, certainty, F_est to pair_data
    pair_data["F_est"] = F_est
    pair_data["kpts1"] = kpts1.cpu().numpy()
    pair_data["kpts2"] = kpts2.cpu().numpy()
    pair_data["kpts_mask"] = mask
    pair_data["sample_certainty"] = sample_certainty.cpu().numpy()


def calculate_E(pair_data):
    F_est = pair_data["F_est"]
    K1 = pair_data["K1"]
    K2 = pair_data["K2"]
    kpts1 = pair_data["kpts1"]
    kpts2 = pair_data["kpts2"]
    mask = pair_data["kpts_mask"]

    # compute the essential matrix from F and the intrinsics
    E_est = K2.T @ F_est @ K1

    # decompose E into R and t
    _, R_est, t_est, _ = cv2.recoverPose(E_est, kpts1, kpts2, mask=mask)
    pair_data["R_est"] = R_est
    pair_data["t_est"] = t_est
```

Moreland-cas commented 1 year ago

> Could you share which estimator you used? Was it the OpenCV estimator used for MegaDepth-1500?

I found that the estimation error can differ a lot between samples (using the same sampling method); the difference can reach 5°. Is this normal? Is there any way to make the camera pose estimation more stable?
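
(One pragmatic way to quantify and reduce this variance is to fix the sampling seed and/or repeat the sample-then-estimate step, keeping the run with the most inliers. A hypothetical sketch; estimate_pose_fn is a placeholder callback, not part of DKM:)

```python
import torch

def stable_estimate(dkm_model, warp, certainty, estimate_pose_fn, runs=5, seed=0):
    # repeat sampling + RANSAC and keep the solution with the most inliers;
    # estimate_pose_fn(matches) is assumed to return (R, t, num_inliers)
    torch.manual_seed(seed)  # makes torch.multinomial sampling reproducible
    best = None
    for _ in range(runs):
        matches, _ = dkm_model.sample(warp, certainty, num=5000)
        R, t, n_inl = estimate_pose_fn(matches)
        if best is None or n_inl > best[2]:
            best = (R, t, n_inl)
    return best
```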

Moreland-cas commented 1 year ago

> I cannot reproduce this. Could it be that what you are plotting are the inliers after RANSAC? It might be that the estimation converges to a degenerate solution with the camera being mirrored (I say this because the translation error seems to be low).
>
> It also seems like the points lie approximately on a plane; if you estimated a fundamental matrix, planar scenes are a degenerate case.

That's right, the plotted points are only the inliers.

Parskatt commented 1 year ago

I can't investigate deeply (lack of time), but I'm guessing the very low inlier threshold might create issues for these images. MegaDepth has incorrect intrinsics for some images, meaning that some pairs will inherently not match perfectly; when you enforce a very strict threshold, these kinds of degenerate solutions may appear. One easy check would be to see if R points in the wrong direction (e.g. points end up behind the camera). If you have K, you could also estimate E directly.
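
(Estimating E directly from calibrated coordinates, as suggested, could look like the sketch below; it assumes kpts1/kpts2 are N x 2 pixel arrays and K1, K2 are known:)

```python
import cv2
import numpy as np

def estimate_pose_calibrated(kpts1, kpts2, K1, K2, thresh=0.5):
    # normalize pixel coordinates with the intrinsics, then estimate E
    # with an identity camera matrix and a correspondingly rescaled threshold
    k1 = cv2.undistortPoints(kpts1.reshape(-1, 1, 2).astype(np.float64), K1, None).reshape(-1, 2)
    k2 = cv2.undistortPoints(kpts2.reshape(-1, 1, 2).astype(np.float64), K2, None).reshape(-1, 2)
    norm_thresh = thresh / np.mean([K1[0, 0], K1[1, 1], K2[0, 0], K2[1, 1]])
    E, mask = cv2.findEssentialMat(k1, k2, np.eye(3), method=cv2.RANSAC,
                                   prob=0.999999, threshold=norm_thresh)
    # recoverPose performs the cheirality check (points in front of both cameras)
    _, R, t, _ = cv2.recoverPose(E, k1, k2, np.eye(3), mask=mask)
    return R, t
```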

Moreland-cas commented 1 year ago

> ...the very low inlier threshold might create issues for these images.

Well, that already helps a lot, thanks! Just one more question: do you think the performance of the network on the MegaDepth dataset is already saturated?

Parskatt commented 1 year ago

I think performance on MegaDepth can still increase a lot in terms of warp accuracy (matching correctly), but since the "ground truth" itself is noisy, I think the geometry-estimation accuracy may be starting to saturate. A "perfect" method would probably reach something like 65 AUC@5 with the current RANSAC estimator.
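
(For reference, AUC@5 here is the area under the pose-recall curve up to a 5° error threshold; a common SuperGlue-style way to compute it, as a sketch:)

```python
import numpy as np

def pose_auc(errors, threshold=5.0):
    # area under the recall-vs-error curve, normalized by the threshold
    errors = np.sort(np.asarray(errors, dtype=float))
    recall = np.arange(1, len(errors) + 1) / len(errors)
    errors = np.concatenate(([0.0], errors))
    recall = np.concatenate(([0.0], recall))
    last = np.searchsorted(errors, threshold)
    e = np.concatenate((errors[:last], [threshold]))
    r = np.concatenate((recall[:last], [recall[last - 1]]))
    return np.trapz(r, x=e) / threshold
```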

I'll close the issue :)