Tangshitao / MVDiffusion

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion, NeurIPS 2023 (spotlight)

Question about Panorama Homography Matrix Computation #26

Closed iaoqian closed 8 months ago

iaoqian commented 8 months ago

Hi there, I'm confused about the homography matrix computation in the function pano/utils/get_correspondence, which looks like this:

# ...
R_left = R_left.reshape(-1, 3, 3)
R_right = R_right.reshape(-1, 3, 3)
K_left = K_left.reshape(-1, 3, 3)
K_right = K_right.reshape(-1, 3, 3)

homo_l = (K_right@torch.inverse(R_right) @
        R_left@torch.inverse(K_left))

xyz_l = torch.tensor(get_x_2d(img_h, img_w),
                    device=R.device)
xyz_l = (
    xyz_l.reshape(-1, 3).T)[None].repeat(homo_l.shape[0], 1, 1)
# ...

As far as I understand, this part of the code computes a homography matrix from the camera K and R, since you name it 'homo_l' and use it as a homography in the code that follows. But I don't understand why it is computed this way, as I found something similar but actually different on Stack Overflow: "Compute Homography Matrix based on intrinsic and extrinsic camera parameters".

According to that answer, the formula for computing the homography from Cam_2 to Cam_1 is H = K2 * R_2_1 * inv(K1), where R_2_1 = R_2_0 * R_1_0.transposed.

That differs from your version. Could you provide some references for your formulation, or simply explain it?

Tangshitao commented 8 months ago

I don't see any difference between my implementation and the one you refer to.

zhshi0816 commented 7 months ago

I think there is a minor issue: it should be homo_l = (K_right @ R_right @ torch.inverse(R_left) @ torch.inverse(K_left)) instead of your current implementation.

For example:
Take a pixel in camera 1: p_1 = (x, y, 1) in homogeneous coordinates.
Back project it into a ray in 3D space: P_1 = inv(K_1) p_1.
Decompose the ray in the coordinates of camera 2: P_2 = R_2_1 P_1.
Project the ray into a pixel in camera 2: p_2 = K_2 P_2.
Put the equations together: p_2 = [K_2 R_2_1 inv(K_1)] p_1, and R_2_1 should be R_2 * R_1.transposed.

Do you agree?
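
Whether the two formulas in this thread agree depends on what R stores, and the thread does not say which convention get_correspondence uses. The following is a minimal sketch, not the repository's code: it uses NumPy rather than torch, and the intrinsics, rotations, and helper names (rot_y, project, apply_h) are hypothetical values chosen only for illustration. If R is a world-to-camera rotation, the relative rotation is R_2 @ R_1.T. If R is a camera-to-world rotation, it is inv(R_2) @ R_1, which has the same form as homo_l in the question. For the same physical cameras both give the same homography, while applying one formula to matrices stored in the other convention does not.

```python
import numpy as np

def rot_y(deg):
    """Rotation matrix about the y axis by `deg` degrees."""
    t = np.deg2rad(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(K, R_w2c, X_world):
    """Project a world-frame ray direction using a world-to-camera rotation."""
    p = K @ R_w2c @ X_world
    return p / p[2]

def apply_h(H, p):
    """Apply a homography to a homogeneous pixel and renormalize."""
    q = H @ p
    return q / q[2]

# Hypothetical shared intrinsics and two pure-rotation cameras (assumptions).
K = np.array([[256.0, 0.0, 256.0], [0.0, 256.0, 256.0], [0.0, 0.0, 1.0]])
R1_w2c, R2_w2c = rot_y(10.0), rot_y(35.0)

# A world-frame ray direction that lies in front of both cameras.
X = np.array([0.1, -0.2, 1.0])
p1 = project(K, R1_w2c, X)
p2 = project(K, R2_w2c, X)

# World-to-camera convention: H = K2 @ R2 @ R1^T @ inv(K1).
H_w2c = K @ R2_w2c @ R1_w2c.T @ np.linalg.inv(K)

# Camera-to-world convention: the same cameras, stored as R_c2w = R_w2c^T.
R1_c2w, R2_c2w = R1_w2c.T, R2_w2c.T
H_c2w = K @ np.linalg.inv(R2_c2w) @ R1_c2w @ np.linalg.inv(K)

# Mixed up: the world-to-camera formula applied to camera-to-world matrices.
H_mixed = K @ R2_c2w @ R1_c2w.T @ np.linalg.inv(K)

print(np.allclose(apply_h(H_w2c, p1), p2))    # world-to-camera formula
print(np.allclose(apply_h(H_c2w, p1), p2))    # camera-to-world formula
print(np.allclose(apply_h(H_mixed, p1), p2))  # mismatched convention
```

Under these assumptions, the first two checks succeed and the third fails, so the disagreement in the thread may come down to whether R_left and R_right are camera-to-world or world-to-camera rotations.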