NVlabs / dex-ycb-toolkit

A Python package that provides evaluation and visualization tools for the DexYCB dataset
https://dex-ycb.github.io
GNU General Public License v3.0

Orientation of world coordinate system #10

Closed · christsa closed this issue 2 years ago

christsa commented 2 years ago

Hi

I have a question about the orientation of the world coordinate system. When running the visualizer via the view_sequence.py file, the entire scene appears tilted with respect to the displayed xyz world coordinates (see attached image). Is there a way to retrieve the orientation of the "scene", e.g., the table orientation with respect to the world coordinates?

Thanks for your reply!

[Attached image: dexycb_vis — visualizer output showing the tilted scene]

ychao-nvidia commented 2 years ago

Yes, we do have the table frame. This was obtained by placing an AprilTag on the surface of the table during calibration.

Currently, the extrinsics are computed with respect to the frame of one camera, which also serves as the world frame in the visualizer. Below is an example of how to transform the extrinsics so that they are expressed with respect to the table frame (i.e., setting the world coordinates to the table frame).
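For intuition, the change below is just a rigid-frame composition. Here is a minimal sketch of the math with plain NumPy arrays (the helper `rebase_extrinsics` is hypothetical, not part of the toolkit; it assumes each extrinsic maps camera coordinates to world coordinates and that `T['apriltag']` holds the tag-to-world pose):

    def rebase_extrinsics(R_c, t_c, R_tag, t_tag):
      # Extrinsics map camera -> world:  x_w = R_c @ x_c + t_c
      # The tag pose maps tag -> world:  x_w = R_tag @ x_t + t_tag
      # Composing the two gives camera -> tag:
      #   x_t = R_tag.T @ (x_w - t_tag)
      #       = (R_tag.T @ R_c) @ x_c + R_tag.T @ (t_c - t_tag)
      R_new = R_tag.T @ R_c            # rotations are orthonormal: inverse == transpose
      t_new = R_tag.T @ (t_c - t_tag)
      return R_new, t_new

The PyTorch diff below applies exactly this composition to every camera.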

First, make the following change starting at this line. This updates the extrinsics and changes the frame of the point cloud.

     self._R = [T[s][:, :3] for s in self._serials]
     self._t = [T[s][:, 3] for s in self._serials]
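+    # Inverse of the tag pose: maps points from the original camera-based
+    # world frame into the table (tag) frame.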
+    tag_R_inv = torch.inverse(T['apriltag'][:, :3])
+    tag_t_inv = torch.mv(tag_R_inv, -T['apriltag'][:, 3])
+    for c in range(self._num_cameras):
+      self._R[c] = torch.mm(tag_R_inv, self._R[c])
+      self._t[c] = torch.addmv(tag_t_inv, tag_R_inv, self._t[c])
     self._R_inv = [torch.inverse(r) for r in self._R]
     self._t_inv = [torch.mv(r, -t) for r, t in zip(self._R_inv, self._t)]
     self._master_intrinsics = self._K[[
         i for i, s in enumerate(self._serials) if s == extr['master']
     ][0]].cpu().numpy()
-    self._tag_R = T['apriltag'][:, :3]
-    self._tag_t = T['apriltag'][:, 3]
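+    # The world frame now coincides with the tag frame, so the stored tag
+    # pose reduces to the identity transform.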
+    self._tag_R = torch.eye(3, dtype=torch.float32, device=self._device)
+    self._tag_t = torch.zeros(3, dtype=torch.float32, device=self._device)
     self._tag_R_inv = torch.inverse(self._tag_R)
     self._tag_t_inv = torch.mv(self._tag_R_inv, -self._tag_t)
     self._tag_lim = [-0.00, +1.20, -0.10, +0.70, -0.10, +0.70]

Next, you also need to update the YCB and MANO poses.

For YCB, make the following change starting at this line:

       ycb_pose = data['pose_y']
       i = np.any(ycb_pose != [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0], axis=2)
       pose = ycb_pose.reshape(-1, 7)
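+      # Re-express each object pose in the tag frame:
+      #   R' = tag_R_inv @ R,  t' = tag_R_inv @ t + tag_t_inv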
+      q = pose[:, :4]
+      t = pose[:, 4:]
+      R = Rot.from_quat(q).as_dcm().astype(np.float32)
+      R = torch.from_numpy(R).to(self._device)
+      t = torch.from_numpy(t).to(self._device)
+      R = torch.bmm(tag_R_inv.expand(R.size(0), -1, -1), R)
+      t = torch.addmm(tag_t_inv, t, tag_R_inv.t())
+      R = R.cpu().numpy()
+      t = t.cpu().numpy()
+      q = Rot.from_dcm(R).as_quat().astype(np.float32)
+      pose = np.hstack([q, t])
       v, n = self.transform_ycb(pose)
       self._ycb_vert = [
           np.zeros((self._num_frames, n, 3), dtype=np.float32)
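Two details worth noting here: SciPy's `Rotation.from_quat` uses the scalar-last (x, y, z, w) quaternion convention, which matches the `pose_y` layout above (quaternion in the first four entries, translation in the last three), and `as_dcm`/`from_dcm` were renamed to `as_matrix`/`from_matrix` in later SciPy releases. For reference, a single-pose equivalent of the batched transform, written as a sketch against the newer SciPy names (`rebase_ycb_pose` is a hypothetical helper, not part of the toolkit):

    import numpy as np
    from scipy.spatial.transform import Rotation as Rot

    def rebase_ycb_pose(pose7, R_tag, t_tag):
      q, t = pose7[:4], pose7[4:]        # pose_y layout: [qx, qy, qz, qw, tx, ty, tz]
      R = Rot.from_quat(q).as_matrix()   # scalar-last quaternion convention
      R_new = R_tag.T @ R                # same composition as in the diff above
      t_new = R_tag.T @ (t - t_tag)
      return np.hstack([Rot.from_matrix(R_new).as_quat(), t_new])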

For MANO, make the following change starting at this line:

       data = np.load(mano_file)
       mano_pose = data['pose_m']
       i = np.any(mano_pose != 0.0, axis=2)
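+      # Offset the translation by the root joint position (root_trans) so
+      # the rigid transform is applied about the correct point; the offset
+      # is removed again after the transform.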
+      root_trans = self._mano_group_layer.root_trans.cpu().numpy()
+      mano_pose = mano_pose.reshape(-1, 51)
+      pose = mano_pose[:, np.r_[0:3, 48:51]]
+      pose[:, 3:] += root_trans
+      r = pose[:, :3]
+      t = pose[:, 3:]
+      r = torch.from_numpy(r).to(self._device)
+      t = torch.from_numpy(t).to(self._device)
+      R = rv2dcm(r)
+      R = torch.bmm(tag_R_inv.expand(R.size(0), -1, -1), R)
+      t = torch.addmm(tag_t_inv, t, tag_R_inv.t())
+      r = dcm2rv(R)
+      r = r.cpu().numpy()
+      t = t.cpu().numpy()
+      pose = np.hstack([r, t])
+      pose[:, 3:] -= root_trans
+      mano_pose[:, np.r_[0:3, 48:51]] = pose
       pose = torch.from_numpy(mano_pose).to(self._device)
       pose = pose.view(-1, self._mano_group_layer.num_obj * 51)
       verts, _ = self._mano_group_layer(pose)
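The helpers `rv2dcm` and `dcm2rv` used above convert between rotation vectors (axis-angle) and rotation matrices, since the MANO global rotation is stored as a rotation vector rather than a quaternion. For reference, an equivalent pair in plain SciPy (a sketch, not the toolkit's implementation):

    from scipy.spatial.transform import Rotation as Rot

    def rv2dcm_np(r):
      # rotation vector (axis * angle, radians) -> 3x3 rotation matrix
      return Rot.from_rotvec(r).as_matrix()

    def dcm2rv_np(R):
      # 3x3 rotation matrix -> rotation vector
      return Rot.from_matrix(R).as_rotvec()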

Now run:

python examples/view_sequence.py --name 20200709-subject-01/20200709_141754

The point cloud and models should now be in the table frame:

[Screenshot from 2021-09-01: the point cloud and models rendered in the table frame]

christsa commented 2 years ago

Hi

Thanks a lot for the quick response! Everything worked out as you described after applying the changes.