xiexh20 / behave-dataset

Code to access BEHAVE dataset, CVPR'22
https://virtualhumans.mpi-inf.mpg.de/behave/

Confusion regarding extrinsic params #4

Closed mnauf closed 2 years ago

mnauf commented 2 years ago

The supplementary paper states that:

"We use checkerboard to calibrate the relative poses between different kinects in a pairwise manner. Specifically, we capture 20 pairs of RGB-D images from two kinects and then register each color image with corresponding depth image such that they have the same resolution. We then use OpenCV to extract the checkerboard corners in the color images and obtain their 3D camera coordinates utilizing the registered depth map. Finally, we perform a Procrustes registration on these ordered 3D checkerboard corners to obtain the relative transformation between two kinects. We obtain 3 pairs of relative transformation for 4 kinects and combine them to compute the transformation under a common world coordinate."

I was hoping that the extrinsic parameters would be the pose of each camera with respect to a world coordinate system defined by the checkerboard, but it looks like I got that wrong. Reading the supplementary statement about how the extrinsic params were obtained confused me even more.

Q1: Can you please explain what the extrinsics for each camera represent in this paper? Are they relative to cam1? I ask because cam1's rotation is the identity matrix and its translation is a zero vector. And what does it mean that "We obtain 3 pairs of relative transformation for 4 kinects and combine them to compute the transformation under a common world coordinate"?

{
  "rotation": [
    1.0,
    0.0,
    0.0,
    0.0,
    1.0,
    0.0,
    0.0,
    0.0,
    1.0
  ],
  "translation": [
    0.0,
    0.0,
    0.0
  ]
}

Q2: Are depth and color images already aligned or do I need to transform coordinates of depth image to color camera coordinate system?

xiexh20 commented 2 years ago

Hi, Q1: the extrinsics are the relative camera transformations. In our case, we store the local camera-to-world transformation in the config files. The world coordinate frame is set to the camera coordinate frame of Kinect 1, hence the identity in that file.

Q2: Yes, the depth and color images are already aligned; both are in the color camera space.
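To make this concrete, here is a minimal sketch of reading one of the config files and mapping a point from that camera's frame to the world (= Kinect 1) frame. It assumes the 9 rotation values are stored row-major as in the JSON above; the file path and helper name are only illustrative:

import json
import numpy as np

def load_extrinsics(config_file):
    """Load the 3x3 rotation (assumed row-major) and 3-vector translation from a config json."""
    with open(config_file) as f:
        calib = json.load(f)
    R = np.array(calib["rotation"]).reshape(3, 3)
    t = np.array(calib["translation"])
    return R, t

# camera -> world (= Kinect 1 frame); the path below is just an example layout
R, t = load_extrinsics("behave/calibs/Date01/config/2/config.json")
p_cam = np.array([0.1, 0.2, 1.5])   # a 3D point in this camera's coordinates (meters)
p_world = R @ p_cam + t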


mnauf commented 2 years ago

Thanks @xiexh20. Does that mean that if I want to transform coordinates from cam0 to cam2, I first transform coordinates from cam0 to cam3 using cam0's extrinsics, and then transform from the cam3 coordinate system to the cam2 coordinate system using the inverse of cam2's extrinsics?

In other words, cam0's extrinsics let me move from cam0 to cam3 (0 -> 3), and then the inverse of cam2's extrinsics lets me move from cam3 to cam2 (3 -> 2).

(0 -> 3 -> 2)

Is that correct?

xiexh20 commented 2 years ago

Yes, the overall idea is correct, except that here you should do 0 -> 1 -> 2, because we use camera 1 as the world coordinate frame.
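For example (a sketch; R_src/t_src and R_dst/t_dst are the rotation/translation from the two cameras' config files, and camera 1's extrinsics are the identity, so the world frame coincides with camera 1's frame):

import numpy as np

def cam_to_cam(p_src, R_src, t_src, R_dst, t_dst):
    """Map a 3D point from the source camera to the target camera via the world frame (= camera 1)."""
    p_world = R_src @ p_src + t_src          # source camera -> world, using its extrinsics
    return R_dst.T @ (p_world - t_dst)       # world -> target camera, inverse of its extrinsics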

xiexh20 commented 2 years ago

The transformations between the cameras are wrapped in this class, which you can play around with: https://github.com/xiexh20/behave-dataset/blob/main/data/kinect_transform.py

mnauf commented 2 years ago

@xiexh20 oh yeah yeah. That's what I meant. Thanks loads