naver / dust3r

DUSt3R: Geometric 3D Vision Made Easy
https://dust3r.europe.naverlabs.com/

How does dust3r use known camera poses in global alignment? #143

Open resurgo97 opened 3 months ago

resurgo97 commented 3 months ago

Hello,

I'm testing dust3r on the Lego dataset with known camera poses, to see whether this improves the (world-coordinate) pointmaps produced by the global alignment optimization. I've successfully loaded and set the camera poses using preset_pose (as mentioned in #30), but I'm unsure whether this actually affects, let alone improves, the global alignment optimization (equation (5)).
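
For reference, equation (5) from the paper (the global alignment objective), as I read it, optimizes a pose and scale per edge rather than per image:

\chi^* = \arg\min_{\chi, P, \sigma} \sum_{e \in \mathcal{E}} \sum_{v \in e} \sum_{i=1}^{HW} C^{v,e}_i \left\lVert \chi^v_i - \sigma_e P_e X^{v,e}_i \right\rVert

where \chi^v are the world-frame pointmaps being optimized, P_e and \sigma_e are the per-edge pose and scale, and X^{v,e}_i are the pairwise network predictions.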

In preset_pose, when known poses are provided, it updates self.im_poses:

self._no_grad(self._set_pose(self.im_poses, idx, torch.tensor(pose), force=True))
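
For context, here is a minimal sketch of how I am setting them, following the demo pipeline from the README (the exact argument values are my own choices):

from dust3r.cloud_opt import global_aligner, GlobalAlignerMode

scene = global_aligner(output, device='cuda', mode=GlobalAlignerMode.PointCloudOptimizer)
# known_poses: one 4x4 cam-to-world matrix (torch tensor) per image
scene.preset_pose(known_poses)  # internally calls _set_pose(..., force=True) under no_grad
loss = scene.compute_global_alignment(init='mse', niter=300, schedule='cosine', lr=0.01)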

However, during compute_global_alignment, self.pw_poses, which are randomly initialized, are used and updated:

def forward(self):
    pw_poses = self.get_pw_poses()  # cam-to-world
    pw_adapt = self.get_adaptors().unsqueeze(1)
    proj_pts3d = self.get_pts3d(raw=True)

Also, it appears that self.im_poses is used to position the cameras in the GUI (though I'm not sure).

I'm confused about why the code has a separate self.pw_poses attribute instead of reusing self.im_poses, since, as I understand it, the pairwise poses should just be a subset of self.im_poses.

Could you please clarify:

  1. Does setting known poses through preset_pose actually affect or improve the global alignment process?
  2. Why are there separate self.pw_poses and self.im_poses attributes, and how do they interact during optimization?

Thank you

yocabon commented 3 months ago
pw_poses = self.get_pw_poses()  # one pose per edge, with scaling -> think of it as the pose of the first image in the pair, plus a per-edge scale factor
pw_adapt = self.get_adaptors().unsqueeze(1)  # small correction to compensate for badly estimated intrinsics

# predictions for the first image in the pair, rescaled and mapped to world space
aligned_pred_i = geotrf(pw_poses, pw_adapt * self._stacked_pred_i)
# predictions for the second image in the pair, likewise (pred_j lives in the same frame as pred_i)
aligned_pred_j = geotrf(pw_poses, pw_adapt * self._stacked_pred_j)

proj_pts3d = self.get_pts3d(raw=True)
# proj_pts3d[self._ei] = geotrf(im_poses[self._ei], rel_ptmaps)  # optimized pointmaps, converted to world space using im_poses

# distance between the predictions and the optimized pointmaps, now in the same (world) space
li = self.dist(proj_pts3d[self._ei], aligned_pred_i, weight=self._weight_i).sum() / self.total_area_i
lj = self.dist(proj_pts3d[self._ej], aligned_pred_j, weight=self._weight_j).sum() / self.total_area_j

return li + lj

If self.im_poses is frozen, then pw_poses will still have to converge to values consistent with those poses: proj_pts3d is computed from im_poses, and the loss forces the pairwise-aligned predictions to match proj_pts3d. So presetting poses does constrain the optimization, even though the pw_poses variables themselves stay free.
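
A toy illustration of that constraint (not dust3r code; translation-only "poses" for simplicity, with scale fixed to 1):

import torch

torch.manual_seed(0)
pred = torch.rand(100, 3)                  # pairwise prediction, in the pair's frame
t_im = torch.tensor([1.0, 2.0, 3.0])       # frozen im_pose (translation only)
proj_pts3d = pred + t_im                   # world pointmap built from the frozen pose

t_pw = torch.zeros(3, requires_grad=True)  # free pw_pose analogue
opt = torch.optim.Adam([t_pw], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = ((pred + t_pw) - proj_pts3d).square().sum()  # li-style distance term
    loss.backward()
    opt.step()
print(t_pw.detach())  # ~[1, 2, 3]: the free pose is pulled to match the frozen one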