rust-cv / pnp

Perspective-n-Point algorithm
MIT License

Regarding vocabulary #1

Open mpizenberg opened 4 years ago

mpizenberg commented 4 years ago

In my opinion, one of the most frustrating mistakes in CV is accidentally swapping a projection matrix with its inverse. I think using a consistent vocabulary is therefore quite important.

Usually the image formation process is modeled as x = K * T * X, where T is the transformation due to the movement (rigid body motion of the camera) and K is the intrinsics matrix. This matrix T thus converts world coordinates to camera coordinates. Therefore, in most algorithms, the objective is to find the rotation and translation composing this matrix, T = [ R, t ].
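To make the convention concrete, here is a minimal sketch of x = K * T * X in plain Rust. It assumes a pinhole model without skew or distortion. The names `WorldToCamera`, `Intrinsics` and `project` are made up for this illustration and are not taken from pnp or cv-core.

```rust
/// Rigid-body transform T = [R | t] mapping world coordinates to camera coordinates.
/// (Hypothetical type for illustration only.)
struct WorldToCamera {
    r: [[f64; 3]; 3],
    t: [f64; 3],
}

/// Pinhole intrinsics K, assuming no skew and no lens distortion.
struct Intrinsics {
    fx: f64,
    fy: f64,
    cx: f64,
    cy: f64,
}

impl WorldToCamera {
    /// Computes R * p + t, i.e. the point expressed in the camera frame.
    fn apply(&self, p: [f64; 3]) -> [f64; 3] {
        let mut out = self.t;
        for i in 0..3 {
            for j in 0..3 {
                out[i] += self.r[i][j] * p[j];
            }
        }
        out
    }
}

/// Projects a world point to pixel coordinates: x = K * T * X.
/// Returns None when the point is not in front of the camera.
fn project(k: &Intrinsics, t: &WorldToCamera, world_point: [f64; 3]) -> Option<[f64; 2]> {
    let c = t.apply(world_point);
    if c[2] <= 0.0 {
        return None;
    }
    Some([k.fx * c[0] / c[2] + k.cx, k.fy * c[1] / c[2] + k.cy])
}

fn main() {
    let k = Intrinsics { fx: 500.0, fy: 500.0, cx: 320.0, cy: 240.0 };
    // Identity rotation, translation of 2 along z:
    // the world origin sits 2 units in front of the camera.
    let t = WorldToCamera {
        r: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
        t: [0.0, 0.0, 2.0],
    };
    // The world origin lands on the principal point.
    println!("{:?}", project(&k, &t, [0.0, 0.0, 0.0])); // prints Some([320.0, 240.0])
}
```

Note that T is applied to world points directly; nothing has to be inverted on the way from world to pixel.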

Sometimes, however, we use the term "camera pose", which describes the rotation and the translation of the camera itself in world coordinates. If we write C (for "camera") for this pose, then C is actually the inverse of the world-to-camera transformation T: C = inv(T).
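As a rough illustration of C = inv(T): for a rigid transform T = [ R, t ], the inverse is [ R^T, -R^T t ], so the translation part of C is the camera center in world coordinates. The sketch below uses a hypothetical `Rigid` struct on plain arrays (not a type from pnp or cv-core) and assumes R is a proper rotation matrix.

```rust
/// A rigid-body transform p -> R * p + t. (Hypothetical type for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rigid {
    r: [[f64; 3]; 3],
    t: [f64; 3],
}

impl Rigid {
    /// Applies the transform to a point.
    fn apply(&self, p: [f64; 3]) -> [f64; 3] {
        let mut out = self.t;
        for i in 0..3 {
            for j in 0..3 {
                out[i] += self.r[i][j] * p[j];
            }
        }
        out
    }

    /// Inverse of a rigid transform: [R | t]^-1 = [R^T | -R^T t].
    /// Only valid when `r` is a rotation matrix (orthonormal).
    fn inverse(&self) -> Rigid {
        let mut rt = [[0.0; 3]; 3];
        for i in 0..3 {
            for j in 0..3 {
                rt[i][j] = self.r[j][i];
            }
        }
        let mut t = [0.0; 3];
        for i in 0..3 {
            for j in 0..3 {
                t[i] -= rt[i][j] * self.t[j];
            }
        }
        Rigid { r: rt, t }
    }
}

fn main() {
    // T: world -> camera. A 90 degree rotation about z plus a translation.
    let world_to_camera = Rigid {
        r: [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
        t: [1.0, 2.0, 3.0],
    };
    // C = inv(T): the camera pose, mapping camera coordinates back to world coordinates.
    let camera_pose = world_to_camera.inverse();

    // The translation of C is the camera center expressed in world coordinates...
    println!("camera center in world: {:?}", camera_pose.t);
    // ...and T maps that center to the origin of the camera frame.
    println!("center in camera frame: {:?}", world_to_camera.apply(camera_pose.t));
}
```

Confusing the two shows up immediately here: `world_to_camera.t` is [1, 2, 3], while the camera actually sits at [-2, 1, -3] in the world.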

The manual test uses a camera "pose" that is actually not a pose but the transformation matrix T (its inverse). I just wanted to point this out since I was surprised when reading the line:

let world_points = camera_depth_points.map(|p| pose.inverse() * p);

vadixidav commented 4 years ago

I actually created separate types called "CameraPose" and "WorldPose" in cv-core. To reflect this terminology, I will rename the variables here to be called world_pose.
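For readers wondering how separate types help, here is a small sketch of the general newtype idea. It does not reproduce the cv-core definitions of "CameraPose" and "WorldPose"; the names `WorldToCamera`, `CameraToWorld`, `WorldPoint` and `CameraPoint` are hypothetical and chosen only to make the direction of each transform explicit.

```rust
/// Points tagged with the frame they live in. (Hypothetical types for illustration.)
#[derive(Clone, Copy, Debug, PartialEq)]
struct WorldPoint([f64; 3]);
#[derive(Clone, Copy, Debug, PartialEq)]
struct CameraPoint([f64; 3]);

/// Untyped rigid transform p -> R * p + t, with R assumed to be a rotation.
#[derive(Clone, Copy, Debug)]
struct Rigid {
    r: [[f64; 3]; 3],
    t: [f64; 3],
}

impl Rigid {
    fn apply(&self, p: [f64; 3]) -> [f64; 3] {
        let mut out = self.t;
        for i in 0..3 {
            for j in 0..3 {
                out[i] += self.r[i][j] * p[j];
            }
        }
        out
    }

    /// [R | t]^-1 = [R^T | -R^T t]
    fn inverse(&self) -> Rigid {
        let mut r = [[0.0; 3]; 3];
        let mut t = [0.0; 3];
        for i in 0..3 {
            for j in 0..3 {
                r[i][j] = self.r[j][i];
                t[i] -= self.r[j][i] * self.t[j];
            }
        }
        Rigid { r, t }
    }
}

/// T in the issue's notation: world coordinates -> camera coordinates.
struct WorldToCamera(Rigid);
/// C = inv(T) in the issue's notation: camera coordinates -> world coordinates.
struct CameraToWorld(Rigid);

impl WorldToCamera {
    fn transform(&self, p: WorldPoint) -> CameraPoint {
        CameraPoint(self.0.apply(p.0))
    }
    fn inverse(&self) -> CameraToWorld {
        CameraToWorld(self.0.inverse())
    }
}

impl CameraToWorld {
    fn transform(&self, p: CameraPoint) -> WorldPoint {
        WorldPoint(self.0.apply(p.0))
    }
}

fn main() {
    let t = WorldToCamera(Rigid {
        r: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
        t: [0.0, 0.0, 2.0],
    });
    let in_camera = t.transform(WorldPoint([1.0, 2.0, 3.0]));
    let back = t.inverse().transform(in_camera);
    println!("{:?} -> {:?}", in_camera, back);
    // t.transform(in_camera) would not compile: a CameraPoint is not a WorldPoint.
}
```

With this kind of split, applying a transform in the wrong direction becomes a type error rather than a silent bug.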