Why is the extrinsic camera matrix not explicit?

rohitdavas commented 2 years ago

The ContactPose class gives access to the properties and information we need. However, the API gives us the projection matrix directly. Even after reading the code, it is not clear to me what the extrinsic matrix is. I have marked the relevant points in the code with comments below:


    def K(self, camera_name):
        """
        Camera intrinsics 3x3
        You will almost never need this. Use self.P() for projection
        """
        return self._K[camera_name]

    def A(self, camera_name):
        """
        Affine transform to be applied to 2D points after projection
        Included in self.P
        """
        # Can you comment on why we need to apply an affine transform after the projection?
        return mutils.get_A(camera_name, 960, 540)

    def P(self, camera_name, frame_idx):
        """
        3x4 3D -> 2D projection matrix
        Use this for all projection operations, not self.K

        """
        # ---> For reference, the notation familiar to me is K * [R|t].
        # ---> Can you explain this code? I would like to retrieve R and t, i.e. the extrinsic matrix.
        P = self.K(camera_name) @ self.object_pose(camera_name, frame_idx)[:3]
        P = self.A(camera_name) @ P
        return P

    def object_pose(self, camera_name, frame_idx):
        """
        Pose of obj_name w.r.t. camera at frame frame_idx
        4x4 homogeneous matrix
        """
        return self._cto[camera_name][frame_idx]

samarth-robo commented 2 years ago

@rohitdavas A is needed on top of the usual K [R|t] projection to account for the reflections and transposes we applied to the images. The Kinects were mounted upside-down and right-to-left because of mechanical constraints of the capture frame (see the figure in the paper for details), and these reflections and transposes compensate for that.
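To make the role of such an affine concrete, here is a minimal sketch of what a correction for an upside-down camera could look like. It is an illustration only: `flip_180` is a hypothetical helper, not the library's `mutils.get_A`, and it assumes the `960, 540` arguments seen in `A()` are the image width and height.

```python
import numpy as np


def flip_180(width, height):
    """Hypothetical 3x3 affine for a camera mounted upside down.

    Rotates the image by 180 degrees about its center, so pixel (u, v)
    maps to (width - 1 - u, height - 1 - v). This is NOT mutils.get_A.
    """
    return np.array([
        [-1.0, 0.0, width - 1.0],
        [0.0, -1.0, height - 1.0],
        [0.0, 0.0, 1.0],
    ])


A = flip_180(960, 540)  # assumed: 960 x 540 is the image size
top_left = np.array([0.0, 0.0, 1.0])  # homogeneous pixel (u, v, 1)
flipped = A @ top_left  # the top-left pixel lands on the bottom-right pixel
```

The real per-camera matrices may combine reflections and transposes differently, so treat this only as a picture of the idea.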

rohitdavas commented 2 years ago

Thanks, now I understand. I must have skipped this part of the paper. Thanks for answering.

I believe, then, that self.object_pose(camera_name, frame_idx)[:3] is the extrinsic matrix.
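As a rough sketch of how R and t could be pulled out of that 4x4 pose: the matrix below is a made-up stand-in for the return value of `object_pose`, not real ContactPose data, and the variable names are my own.

```python
import numpy as np

# Stand-in for self.object_pose(camera_name, frame_idx): a made-up 4x4 pose
# (90-degree rotation about z plus a translation), not real ContactPose data.
cTo = np.array([
    [0.0, -1.0, 0.0, 0.1],
    [1.0, 0.0, 0.0, 0.2],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])

extrinsic = cTo[:3]  # 3x4 [R|t], the same slice used inside P()
R = cTo[:3, :3]      # 3x3 rotation, object frame to camera frame
t = cTo[:3, 3]       # translation vector, shape (3,)

# Sanity checks that hold for any rigid transform
assert np.allclose(R @ R.T, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)
```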

samarth-robo commented 2 years ago

That is correct, @rohitdavas. The one irregularity is that without A, this extrinsic matrix (combined with K) will project 3D points to the wrong image locations.

rohitdavas commented 2 years ago

Thanks, I get your point. Just to be clear, there are now three transformations:

object to camera - [R|t]
camera to screen - K
correction for the cameras' orientation - A

Thanks.
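The three transformations listed above can be sketched end to end as follows. Every number here is invented for illustration, `project` is a hypothetical helper, and `A` is an assumed 180-degree image flip rather than the matrix the library actually returns.

```python
import numpy as np

# All values below are made up for illustration.
K = np.array([
    [500.0, 0.0, 480.0],
    [0.0, 500.0, 270.0],
    [0.0, 0.0, 1.0],
])
# [R|t]: identity rotation, object origin 1 unit in front of the camera
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [1.0]])])
# Assumed 180-degree flip of a 960 x 540 image (not the library's A)
A = np.array([
    [-1.0, 0.0, 959.0],
    [0.0, -1.0, 539.0],
    [0.0, 0.0, 1.0],
])

P = A @ K @ Rt  # same order as in P(): K [R|t] first, then A


def project(P, X):
    """Project Nx3 object-frame points to Nx2 pixel coordinates."""
    X_h = np.hstack([X, np.ones((len(X), 1))])  # homogeneous, Nx4
    x = (P @ X_h.T).T                           # Nx3
    return x[:, :2] / x[:, 2:3]                 # divide by depth


X = np.array([[0.1, 0.2, 1.0]])
with_A = project(P, X)          # pixel in the stored (corrected) image
without_A = project(K @ Rt, X)  # pixel if A is skipped; differs from with_A
```

Because A is invertible, `np.linalg.inv(A) @ P` gives back K [R|t] if only the full projection matrix is at hand.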

samarth-robo commented 2 years ago

@rohitdavas correct!

facebookresearch / ContactPose, issue #19