Regarding vocabulary

One of the most frustrating mistakes in CV is, in my opinion, accidentally swapping a projection matrix with its inverse. I think using a consistent vocabulary is therefore quite important.

Usually the image formation process is modeled as x = K * T * X, where T is the transformation due to the rigid-body motion of the camera and K is the intrinsics matrix. This matrix T is thus the conversion from world coordinates to camera coordinates. Therefore, in most algorithms, the objective is to find the rotation and translation composing this matrix, T = [ R | t ].
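Written out, the model above looks roughly like this. The homogeneous pixel \tilde{x}, the "equal up to scale" sign, and the subscripts w and c are my notation, not something taken from the crate:

```latex
% World point X_w is first moved into the camera frame, then projected by K.
\[
  X_c = R\,X_w + t,
  \qquad
  \tilde{x} \sim K\,X_c = K\,[\,R \mid t\,]
  \begin{pmatrix} X_w \\ 1 \end{pmatrix}
\]
```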

Sometimes, however, we use the term "camera pose", which describes the rotation and translation of the camera itself in world coordinates. If we denote this pose by C (for "camera"), then C is actually the inverse of the world-to-camera transformation T: C = inv(T).
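To make the distinction concrete, here is a minimal sketch in plain Rust with no dependencies. The `Rigid` type and the variable names are hypothetical and only for illustration; they are not the types used in this crate. It shows that inverting the world-to-camera transform gives the pose, and that the translation part of the pose is the camera center in world coordinates.

```rust
/// Hypothetical rigid-body transform (not this crate's type): p_out = r * p_in + t.
#[derive(Debug, Clone, Copy)]
struct Rigid {
    r: [[f64; 3]; 3],
    t: [f64; 3],
}

impl Rigid {
    /// Apply the transform to a 3D point.
    fn apply(&self, p: [f64; 3]) -> [f64; 3] {
        let mut out = self.t;
        for i in 0..3 {
            for j in 0..3 {
                out[i] += self.r[i][j] * p[j];
            }
        }
        out
    }

    /// Inverse of a rigid transform: rotation R^T, translation -R^T * t.
    fn inverse(&self) -> Rigid {
        let mut rt = [[0.0; 3]; 3];
        let mut t = [0.0; 3];
        for i in 0..3 {
            for j in 0..3 {
                rt[i][j] = self.r[j][i];
            }
        }
        for i in 0..3 {
            for j in 0..3 {
                t[i] -= rt[i][j] * self.t[j];
            }
        }
        Rigid { r: rt, t }
    }
}

fn main() {
    // T: world -> camera (90 degree rotation about z, plus a translation).
    let world_to_camera = Rigid {
        r: [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
        t: [1.0, 2.0, 3.0],
    };
    // C = inv(T): the camera pose, i.e. camera -> world.
    let camera_pose = world_to_camera.inverse();

    // The camera origin mapped through the pose is the camera center in the world frame.
    let center_world = camera_pose.apply([0.0, 0.0, 0.0]);
    println!("camera center in world frame: {:?}", center_world);

    // Mapping that center back through T must land on the camera origin.
    let back = world_to_camera.apply(center_world);
    assert!(back.iter().all(|v| v.abs() < 1e-12));
}
```

With this naming, points expressed in the camera frame are moved to the world frame with `camera_pose.apply(p)`, and no inverse is needed at the call site.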

In the manual test, a camera "pose" is used where it is actually not a pose but the transformation matrix T (its inverse). I just wanted to point it out since I was surprised when reading this line:

let world_points = camera_depth_points.map(|p| pose.inverse() * p);

rust-cv / pnp

Regarding vocabulary #1