skhadem / 3D-BoundingBox

PyTorch implementation for 3D Bounding Box Estimation Using Deep Learning and Geometry
MIT License
435 stars, 96 forks

Question about Derivation #9

Closed cnexah closed 4 years ago

cnexah commented 4 years ago

Hello! I have a question about the derivation. In the given material (http://ywpkwon.github.io/pdf/bbox3d-study.pdf), what does K mean? I think it is the intrinsic matrix, but why is its shape 3 × 4?

Thank you very much!

skhadem commented 4 years ago

I believe 'K' is actually the projection matrix from the base frame of the vehicle to the camera (in my code it's camera 2, so P_rect_02 in the KITTI calibration), which is why it is 3 × 4 rather than 3 × 3. So when you compute K[R|T]X, you take X (the world coordinates of a point), rotate and translate it to bring it to the center of the robot (the car in this case), and then project it into (x, y) coordinates in the camera image.
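To make the shapes in K[R|T]X concrete, here is a minimal NumPy sketch. It is an illustration under assumptions, not code from this repository: the function name `project_point` is hypothetical, and every numeric value (focal length, principal point, fourth-column offset, pose) is made up rather than taken from KITTI. It assumes that with a 3 × 4 K, [R|T] is extended to a 4 × 4 homogeneous transform so the product is well defined.

```python
import numpy as np

def project_point(K, R, T, X):
    """Project a 3D point X (shape (3,)) to pixel coordinates via K [R|T] X.

    K: (3, 4) projection matrix, R: (3, 3) rotation, T: (3,) translation.
    """
    # Build the 4x4 homogeneous rigid transform [[R, T], [0, 0, 0, 1]].
    RT = np.eye(4)
    RT[:3, :3] = R
    RT[:3, 3] = T

    X_h = np.append(X, 1.0)   # homogeneous 3D point, shape (4,)
    x_h = K @ RT @ X_h        # homogeneous image point, shape (3,)
    return x_h[:2] / x_h[2]   # divide by depth to get (u, v)

# Made-up values for illustration only (not real KITTI calibration).
f, cx, cy = 700.0, 600.0, 180.0
K = np.array([
    [f,   0.0, cx,  45.0],   # the fourth column is what makes K 3x4
    [0.0, f,   cy,  0.0],
    [0.0, 0.0, 1.0, 0.0],
])
R = np.eye(3)                    # no rotation
T = np.array([0.0, 0.0, 10.0])   # point ends up 10 units in front of the camera
X = np.array([1.0, 0.5, 0.0])

uv = project_point(K, R, T, X)
print(uv)
```

If the fourth column of K is all zeros, the same code reduces to the familiar 3 × 3 intrinsic matrix applied to the transformed point.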

cnexah commented 4 years ago

Thank you for your reply! It helps me a lot. By the way, could I ask one more question about the KITTI calibration? There are two sources for the camera calibration in KITTI. One is the 'calib_cam_to_cam.txt' file; the other is the 'calib' folder in 'camera calibration matrices of object data set', which has a .txt file for each frame. What is the difference between them?

skhadem commented 4 years ago

Yes, I remember seeing that too. I am not quite sure what the difference is. My guess is that they run one calibration and then also save that calibration for each frame (why, I am not sure). I have seen other people use them interchangeably: some use the calibration from cam_to_cam, and others just grab it from any frame's calibration file.
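One way to check whether the two sources agree for a given sequence is to read the camera 2 projection matrix from each and compare them. The sketch below is hedged: the helper name `read_projection` is hypothetical, it assumes the files use `name: v1 v2 ...` lines, and it assumes the matrix is stored under the key `P2` in the per-frame files and `P_rect_02` in calib_cam_to_cam.txt. The sample strings contain made-up numbers, not real calibration data.

```python
import numpy as np

def read_projection(calib_text, keys=("P2", "P_rect_02")):
    """Return the first 3x4 matrix stored under one of `keys`, or None."""
    for line in calib_text.splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        name, sep, values = line.partition(":")
        if not sep or name.strip() not in keys:
            continue
        numbers = np.array(values.split(), dtype=float)
        if numbers.size == 12:
            return numbers.reshape(3, 4)
    return None

# Made-up file contents for illustration (assumed layout, not real data).
per_frame = (
    "P2: 700 0 600 45 0 700 180 0 0 0 1 0\n"
    "R0_rect: 1 0 0 0 1 0 0 0 1\n"
)
cam_to_cam = (
    "calib_time: 01-Jan-2000 00:00:00\n"
    "P_rect_02: 700 0 600 45 0 700 180 0 0 0 1 0\n"
)

P_frame = read_projection(per_frame)
P_global = read_projection(cam_to_cam)
same = np.allclose(P_frame, P_global)
print(same)
```

With real files, the text would come from `open(path).read()`. If `same` is False for your data, the two sources should not be treated as interchangeable.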

skhadem commented 4 years ago

Closing since it seems your question has been answered.