Finding pixel coordinates from projected values on images

tristanguigue commented 7 years ago

Sorry if this isn't the best place to ask, and I may be missing something, but I was wondering what the values of a Velodyne point mean once it has been projected onto the images. In particular, how do I translate them into pixel coordinates for each camera?

When I call point_cam0 = data.calib.T_cam0_velo.dot(point_velo), I get float values that I'm not sure how to interpret, since I don't know their range or where the center of the image is. How do I get the actual pixel coordinates on each image from those values?

leeclemnet commented 7 years ago

data.calib.T_cam0_velo is the transformation matrix that takes a 3D point expressed in the coordinate system attached to the Velodyne and re-expresses it in the coordinate system attached to cam0 (see here for a diagram). So point_cam0 is just the 3D (x,y,z) coordinates of the point, expressed in the left grayscale camera's frame of reference.
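To make the frame change concrete, here is a minimal NumPy sketch. The matrix below is a made-up stand-in for data.calib.T_cam0_velo (not real KITTI calibration values), and it assumes point_velo is in homogeneous form [x, y, z, 1], so the fourth component must be 1 rather than, say, a reflectance value.

```python
import numpy as np

# Hypothetical 4x4 rigid transform standing in for data.calib.T_cam0_velo.
# Values are invented for illustration. The rotation maps Velodyne axes
# (x forward, y left, z up) onto camera axes (x right, y down, z forward);
# the last column is a small made-up translation in metres.
T_cam0_velo = np.array([
    [0.0, -1.0,  0.0,  0.00],
    [0.0,  0.0, -1.0, -0.08],
    [1.0,  0.0,  0.0, -0.27],
    [0.0,  0.0,  0.0,  1.00],
])

# A point 10 m ahead, 2 m to the left and 1 m below the Velodyne,
# written in homogeneous coordinates [x, y, z, 1].
point_velo = np.array([10.0, 2.0, -1.0, 1.0])

# Same point, re-expressed in the cam0 frame (still 3D, still in metres).
point_cam0 = T_cam0_velo.dot(point_velo)
x, y, z = point_cam0[:3]

print(point_cam0)  # z is the depth along the camera's optical axis
```

The result is still a metric 3D point, which is why the values have no fixed range and no obvious relation to the image center.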

To figure out the pixel coordinates of that point, you need to know how 3D coordinates in the camera frame map onto pixel coordinates, which involves some projective geometry and some properties of the camera itself. Since you're dealing with rectified images, you can use a (relatively) simple pinhole camera model to map (x,y,z) onto (u,v). The parameters of this model (usually called the "intrinsic calibration parameters" of the camera or the "K matrix") are stored in data.calib.K_cam0 for cam0, using the same data object as in your snippet.

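The formula you can use is [u,v,1] = K * [x/z, y/z, 1], where both sides are column vectors, (x,y,z) are the first three components of point_cam0, and z > 0 (i.e. the point is in front of the camera).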
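As a rough sketch of that formula in NumPy: the K matrix here uses invented focal length and principal point values standing in for the real K_cam0, and project_to_pixels is a hypothetical helper, not part of pykitti.

```python
import numpy as np

# Hypothetical 3x3 intrinsic matrix standing in for K_cam0.
# Focal length (700) and principal point (600, 180) are made-up values.
K = np.array([
    [700.0,   0.0, 600.0],
    [  0.0, 700.0, 180.0],
    [  0.0,   0.0,   1.0],
])

def project_to_pixels(points_cam, K):
    """Project (N, 3) camera-frame points to (M, 2) pixel coordinates.

    Returns the pixels for points with z > 0 and the boolean mask
    marking which input points those were.
    """
    points_cam = np.asarray(points_cam, dtype=float)
    z = points_cam[:, 2]
    in_front = z > 0                      # points behind the camera cannot be imaged
    normalized = points_cam[in_front] / z[in_front, None]  # rows of [x/z, y/z, 1]
    uv1 = normalized @ K.T                # row-vector form of K * [x/z, y/z, 1]
    return uv1[:, :2], in_front

points_cam0 = np.array([
    [-2.0, 1.0, 10.0],   # left of and below the optical axis
    [ 0.0, 0.0,  5.0],   # on the optical axis, lands on the principal point
    [ 1.0, 0.5, -3.0],   # behind the camera, dropped
])

pixels, in_front = project_to_pixels(points_cam0, K)
print(pixels)
print(in_front)
```

A point on the optical axis lands on the principal point, which is the "center" the pinhole model uses. The returned (u,v) values can still fall outside the image, so they would also need checking against the image width and height before indexing pixels.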

Hope that answers your question.

tristanguigue commented 7 years ago

I see, yes very useful, thank you!

utiasSTARS / pykitti

Finding pixel coordinates from projected values on images #9