arthurchen0518 / DirectionNet

Wide-Baseline Relative Camera Pose Estimation with Directional Learning (CVPR 2021)
MIT License

Can you provide log files? #1

Closed · AbyssGaze closed this issue 2 years ago

AbyssGaze commented 2 years ago

Hello, I implemented your algorithm in PyTorch and trained it on the MegaDepth dataset, but even when we train the translation model under the ground-truth rotation, the corresponding distribution loss is still very large. I don't know if there is a problem with my implementation.

If it is convenient for you, could you provide the log file of your training at that time? Thank you.

arthurchen0518 commented 2 years ago

I'll contact my collaborators at Google to release the TensorBoard log. Meanwhile, do you visualize the input images that are transformed by the ground-truth rotation, as well as the output distribution as heatmaps?

AbyssGaze commented 2 years ago
Thanks for your reply. I visualized some of the data after the half-rotation warp. The first column is the original image pair, and the second column is the result after the warp.

[images: scene1, scene2]
arthurchen0518 commented 2 years ago

You can try a straightforward sanity check. After you de-rotate the image pair, if the translation direction is within the field of view of the images (for example, in the scenes you showed here the camera seems to move along the viewing direction), you can plot the epipoles in both images and they should be located at the same position relative to the scene. The epipoles are essentially the focus of expansion.
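A minimal sketch of that check, assuming the relative pose maps camera-1 coordinates to camera-2 coordinates (X2 = R X1 + t) and pinhole intrinsics K1, K2:

```python
import numpy as np

def epipoles(K1, K2, R, t):
    """Pixel locations of the epipoles in image 1 and image 2."""
    # Epipole in image 1: projection of camera 2's center, C2 = -R^T t, into view 1.
    e1 = K1 @ (-R.T @ t)
    # Epipole in image 2: projection of camera 1's center, which is t in camera-2 coordinates.
    e2 = K2 @ t
    return e1[:2] / e1[2], e2[:2] / e2[2]

# After de-rotation (R close to identity), both epipoles mark the focus of expansion
# and should sit at the same position relative to the scene content in the two views.
```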


AbyssGaze commented 2 years ago
I started by checking the relative pose through the epipolar lines, and the visualization results seemed to be fine. The only possible problem is that the rotation in some scenes is relatively large, so even with the half rotation, parts of the warp may fall outside the image. Maybe I need to draw the epipolar visualization on the rotated images as well. Thanks.

[images: epipolar_viz1, epipolar_viz2]
arthurchen0518 commented 2 years ago

Yes, this technique is limited when the rotation is extremely large (e.g. cameras facing opposite directions). You can also warp the images to a larger FoV to mitigate the problem. Even if some part of the images is warped out of view (actually quite common in the Matterport3D experiments in the paper), it may hinder the performance but still works.
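A minimal sketch of the larger-FoV warp, assuming OpenCV for the warp and that R maps rays from the original camera frame to the de-rotated frame; the function name and scale factor are illustrative, not part of the DirectionNet code:

```python
import cv2
import numpy as np

def derotate_wide_fov(image, K, R, scale=1.5):
    """De-rotate `image` onto a canvas `scale` times larger to keep more content in view."""
    h, w = image.shape[:2]
    out_w, out_h = int(w * scale), int(h * scale)
    # Keep the focal length but re-center the principal point on the bigger canvas,
    # which widens the effective field of view of the output.
    K_out = K.copy()
    K_out[0, 2] = out_w / 2.0
    K_out[1, 2] = out_h / 2.0
    # Pixel mapping from input to output: x_out ~ K_out R K^-1 x_in
    # (cv2.warpPerspective applies the given matrix as the src->dst map by default).
    H = K_out @ R @ np.linalg.inv(K)
    return cv2.warpPerspective(image, H, (out_w, out_h))
```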


arthurchen0518 commented 2 years ago

Also, try visualizing the epipolar lines on the warped images that are input to the translation model. Do you also visualize the output distribution in TensorBoard during training?
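A minimal sketch of logging the predicted distribution as a heatmap in TensorBoard with PyTorch's SummaryWriter (the tag and layout are illustrative; the distribution is assumed to be an H x W map such as an equirectangular image of the sphere):

```python
import matplotlib.cm as cm
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/translation_debug")

def log_distribution(tag, dist, step):
    """dist: (H, W) tensor holding the predicted spherical distribution."""
    d = dist.detach().float().cpu()
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # normalize to [0, 1] for display
    rgb = cm.viridis(d.numpy())[..., :3]            # colormap to RGB, drop alpha
    writer.add_image(tag, rgb, step, dataformats="HWC")
```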


AbyssGaze commented 2 years ago
It seems that there is indeed a problem with the relative pose after the warp. The visualization code is below.

[images: scene1, scene2, scene3]
```python
import random

import cv2
import numpy as np
import torch

# Project-specific helpers assumed available: dense_matching, homography_warp,
# half_rotation, pose_to_fundamental, make_epipolar_plot_fast.


def visualization_half_rotation_epipolar(batch, output_path):
    relative_pose = batch["relative_pose"]

    # Warp both images by the half-rotation homographies.
    H = batch["H"]
    H_pred_T = torch.inverse(H)
    transformed_src = homography_warp(batch["image1"], H_pred_T)
    transformed_trt = homography_warp(batch["image2"], H)

    # Dense matches between the original images from depth + pose.
    mkpts1, mkpts2 = dense_matching(
        batch["depth1"][0].numpy(),
        batch["intrinsics1"][0].numpy(),
        batch["depth2"][0].numpy(),
        batch["intrinsics2"][0].numpy(),
        relative_pose[0].numpy(),
    )
    # Keep at most 20 random matches for plotting.
    index = random.sample(range(0, mkpts1.shape[0]), min(20, mkpts1.shape[0]))
    mkpts1 = mkpts1[index, :]
    mkpts2 = mkpts2[index, :]

    # Transfer the matches into the half-rotation-warped images (each point set uses
    # the inverse of the matrix applied to its image, assuming homography_warp
    # samples backwards, e.g. kornia's convention).
    warp_mkpts1 = (
        H[0].numpy() @ np.c_[mkpts1, np.ones(mkpts1.shape[0])].transpose(1, 0)
    ).transpose(1, 0)
    warp_mkpts1 = (warp_mkpts1[:, :2] / warp_mkpts1[:, 2:]).astype(np.int64)
    warp_mkpts2 = (
        H_pred_T[0].numpy() @ np.c_[mkpts2, np.ones(mkpts2.shape[0])].transpose(1, 0)
    ).transpose(1, 0)
    warp_mkpts2 = (warp_mkpts2[:, :2] / warp_mkpts2[:, 2:]).astype(np.int64)
    colors = [
        tuple(np.random.choice(range(256), size=3).astype(int))
        for _ in range(min(20, mkpts1.shape[0]))
    ]

    # Relative pose remaining after the half rotation.
    half_R = half_rotation(relative_pose[:, :3, :3])
    translation_gt = torch.matmul(
        torch.inverse(half_R), relative_pose[:, :3, 3].unsqueeze(-1)
    ).squeeze(-1)
    half_relative_pose = torch.eye(4).repeat(relative_pose.shape[0], 1, 1)
    half_relative_pose[:, :3, :3] = half_R
    half_relative_pose[:, :3, 3] = translation_gt

    # Epipolar visualization on the warped images (the returned matrix is treated as
    # an essential matrix and converted to a fundamental matrix with the intrinsics).
    E = pose_to_fundamental(half_relative_pose[0].numpy())
    F = (
        np.linalg.inv(batch["intrinsics2"][0].numpy()).T
        @ E
        @ np.linalg.inv(batch["intrinsics1"][0].numpy())
    )
    warp_viz = make_epipolar_plot_fast(
        transformed_src[0].permute(1, 2, 0).contiguous().numpy() * 255,
        transformed_trt[0].permute(1, 2, 0).contiguous().numpy() * 255,
        warp_mkpts1,
        warp_mkpts2,
        warp_mkpts1,
        warp_mkpts2,
        F,
        colors,
        [],
        None,
    )

    # Epipolar visualization on the original images.
    E = pose_to_fundamental(relative_pose[0].numpy())
    F = (
        np.linalg.inv(batch["intrinsics2"][0].numpy()).T
        @ E
        @ np.linalg.inv(batch["intrinsics1"][0].numpy())
    )
    origin_viz = make_epipolar_plot_fast(
        batch["image1"][0].permute(1, 2, 0).contiguous().numpy() * 255,
        batch["image2"][0].permute(1, 2, 0).contiguous().numpy() * 255,
        mkpts1,
        mkpts2,
        mkpts1,
        mkpts2,
        F,
        colors,
        [],
        None,
    )

    # Stack original (top) and warped (bottom) visualizations and save.
    viz = cv2.vconcat([origin_viz, warp_viz])
    cv2.imwrite("{}/{}".format(output_path, batch["file_name"][0]), viz)
```

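For context, `half_rotation` above is expected to return a rotation with half the angle of the input. A minimal sketch of that idea, halving the angle in axis-angle form with SciPy (this may differ from the actual helper used here):

```python
import torch
from scipy.spatial.transform import Rotation

def half_rotation(R):
    """R: (B, 3, 3) rotation matrices -> rotations about the same axes with half the angle."""
    rotvec = Rotation.from_matrix(R.detach().cpu().numpy()).as_rotvec()  # axis * angle
    half = Rotation.from_rotvec(0.5 * rotvec).as_matrix()                # halve the angle
    return torch.from_numpy(half).to(R.dtype)
```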
arthurchen0518 commented 2 years ago

Maybe try a simpler scenario that keeps the source image still and de-rotates only the target image. See if this test gives some insight into finding the bug.
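A minimal sketch of that simpler test, assuming the relative pose convention X2 = R X1 + t, pinhole intrinsics K2 for the target view, and OpenCV for the warp (the function name is illustrative):

```python
import cv2
import numpy as np

def derotate_target_only(target_img, K2, R):
    """Warp the target image so its orientation matches the source camera."""
    h, w = target_img.shape[:2]
    # Forward pixel map from the original target image to the de-rotated one:
    # a target ray K2^-1 x is rotated into the source orientation by R^T, then re-projected.
    M = K2 @ R.T @ np.linalg.inv(K2)
    return cv2.warpPerspective(target_img, M, (w, h))

# The source image is left untouched; with the target de-rotated this way, the
# residual relative rotation is identity, so the same epipole check applies with R = I.
```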