kongbia opened 3 days ago
Sorry for the late reply.
It's actually the world-to-camera matrix. Thus, predicted/optimized poses as well as GT poses from 3DGS cameras are given in w2c format. In this case, the matrix was simply given the wrong name; I will fix it soon.
Thanks for noticing!
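For reference, a minimal sketch of what such a conversion can look like, assuming the camera tensor is laid out as `[qw, qx, qy, qz, tx, ty, tz]` with the rotation and translation already in w2c form; the repo's actual `from_cam_tensor_to_w2c` may differ in layout and details:

```python
import torch

def quat_to_rotmat(q):
    # Unit quaternion [w, x, y, z] -> 3x3 rotation matrix.
    w, x, y, z = (q / q.norm()).tolist()
    return torch.tensor([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def cam_tensor_to_w2c(cam):
    # cam = [qw, qx, qy, qz, tx, ty, tz]; (R, t) taken as w2c: x_cam = R @ x_world + t.
    w2c = torch.eye(4)
    w2c[:3, :3] = quat_to_rotmat(cam[:4])
    w2c[:3, 3] = cam[4:]
    return w2c

# c2w is just the rigid inverse: [R | t]^-1 = [R^T | -R^T t].
```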
Thanks for your reply, I still have a minor question. In the `compute_warping_loss` function:

```python
def compute_warping_loss(vr, qr, quat_opt, t_opt, pose, K, depth):
    warp = pose @ from_cam_tensor_to_w2c(torch.cat([quat_opt, t_opt], dim=0)).inverse()
    warped_image = differentiable_warp(vr.unsqueeze(0), depth.unsqueeze(0), warp.unsqueeze(0), K.unsqueeze(0))
    loss = F.mse_loss(warped_image, qr.unsqueeze(0))
    return loss
```

`warp` first implements the c2w back-projection via the initial pose `pose`, and then implements the w2c projection via the optimized pose `quat_opt, t_opt`. However, `pose` in the code represents w2c, while `from_cam_tensor_to_w2c(torch.cat([quat_opt, t_opt], dim=0)).inverse()` represents c2w, which seems inconsistent with that principle.
Here the `warp` is defined as `warp = w2c @ c2w`. So in the `differentiable_warp` function, the part

```python
cam_points = warp @ world_points  # (B, 4, H*W)
```

is equivalent to

```python
world_points = c2w @ world_points
cam_points = w2c @ world_points  # (B, 4, H*W)
```

I hope this helps.
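For intuition, here is a minimal end-to-end sketch of such a depth-based warp under a pinhole model; the actual `differentiable_warp` in the repo may differ in details (e.g. masking of invalid depths and out-of-view points):

```python
import torch
import torch.nn.functional as F

def warp_sketch(src, depth, warp, K):
    # src:   (B, C, H, W) image seen from camera A; it gets resampled.
    # depth: (B, 1, H, W) depth in camera B; the output aligns with B's pixel grid.
    # warp:  (B, 4, 4) = w2c_A @ c2w_B, mapping points from B's frame into A's.
    # K:     (B, 3, 3) shared pinhole intrinsics.
    B, C, H, W = src.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).float()      # (3, H, W)
    pix = pix.view(1, 3, -1).expand(B, -1, -1)                       # (B, 3, H*W)
    # Back-project B's pixel grid with its depth into B's camera frame.
    cam_pts = (K.inverse() @ pix) * depth.view(B, 1, -1)
    world_points = torch.cat([cam_pts, torch.ones(B, 1, H * W)], 1)  # homogeneous
    # One matrix carries points camera B -> world -> camera A.
    cam_points = warp @ world_points                                 # (B, 4, H*W)
    # Project into A's image plane (a full implementation would also mask z <= 0).
    proj = K @ cam_points[:, :3]
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)
    # Normalize to [-1, 1] and sample camera A's image at the warped locations.
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1).view(B, H, W, 2)
    return F.grid_sample(src, grid, align_corners=True)
```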
In `loc_inference.py`, the $R, t$ generated by PnP represent the world-to-camera transformation, in my understanding. Why is it assigned to the $c2w$ matrix, which seems to represent the camera-to-world transform in `compute_warping_loss`?
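For context on why I read it as w2c: OpenCV's `solvePnP` returns `rvec`/`tvec` that map world (object) points into the camera frame. A self-contained illustration (the correspondences are dummy values, only the convention matters):

```python
import cv2
import numpy as np

# Dummy 3D-2D correspondences and intrinsics, just to make the call runnable.
obj_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0],
                    [0, 0, 1], [1, 0, 1]], dtype=np.float64)
img_pts = np.array([[320, 240], [400, 240], [320, 320], [400, 320],
                    [300, 220], [420, 220]], dtype=np.float64)
K = np.array([[500, 0, 320], [0, 500, 240], [0, 0, 1]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, None)
R, _ = cv2.Rodrigues(rvec)

# OpenCV's convention: x_cam = R @ x_world + t, i.e. (R, t) is world-to-camera.
w2c = np.eye(4)
w2c[:3, :3], w2c[:3, 3] = R, tvec.ravel()
c2w = np.linalg.inv(w2c)  # camera-to-world is the rigid inverse
```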