marek-simonik / record3d

Accompanying library for the Record3D iOS app (https://record3d.app/). Allows you to receive an RGBD stream from iOS devices with TrueDepth camera(s).
https://record3d.app/
GNU Lesser General Public License v2.1

Few Questions Regarding Intrinsic and Extrinsic Parameters #31

Closed (zehuiz2 closed this issue 2 years ago)

zehuiz2 commented 2 years ago

First, I must say this is a great App and I would like to thank the developer for being responsive! I have some questions regarding the intrinsic and extrinsic parameters:

  1. Are the depth image and the RGB image already fused? Do I need to estimate the rigid transformation between the lidar and the RGB camera?
  2. Do I need to calibrate the RGB/depth images for radial and tangential distortion?
  3. I have difficulty understanding the poses matrix provided in the metadata. From previous posts, I think they are extrinsic parameters. I assume each row corresponds to one image. There are seven parameters in each row. How could extrinsic parameters have only 7 elements? I would have expected 12.

I'm looking forward to hearing from you!

areiner222 commented 2 years ago

I am another enthusiastic user of this app - working great for me!

@zehuiz2 on question 3: extrinsic parameters - take a look at this issue. Also, I believe each row corresponds to each image.

zehuiz2 commented 2 years ago

Thank you! It really helped.

zehuiz2 commented 2 years ago

One follow-up question on question 3. Could I understand it this way:

  1. The quaternion can be used to recover the 3x3 rotation matrix R. Just to confirm, are the elements ordered as qw, qx, qy, qz?
  2. The world pose is actually the translation vector t. Are the elements ordered as tx, ty, tz?
  3. Given the intrinsic parameters K, is the world-to-camera coordinate transformation X_{cam} = K [R|t] X_{world}?

marek-simonik commented 2 years ago

I will first answer your original questions:

  1. It should be fused internally by Apple.
  2. Similar to 1., Apple should have already taken care of distortion correction.
  3. As @areiner222 correctly answered with the link, the poses are stored as a quaternion + world pose.

To answer your follow-up questions:

  1. The quaternion is stored as qx, qy, qz, qw.
  2. Yes, exactly.
  3. I think it should be X_{world} = [R|t] K X_{cam}.
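
For readers who want to turn one 7-element row of the poses matrix into a usable transform, here is a minimal sketch in plain Python. It assumes the row layout is qx, qy, qz, qw, tx, ty, tz (the quaternion order and the translation order come from the answers above; putting the quaternion before the translation is an assumption based on "quaternion + world pose"). It also assumes the pose maps camera coordinates to world coordinates. The names `pose_row_to_matrix` and `transform_point` are hypothetical helpers, not part of the record3d library.

```python
import math

def pose_row_to_matrix(row):
    """Build a 4x4 row-major [R|t] matrix from [qx, qy, qz, qw, tx, ty, tz].

    The row layout is an assumption; verify it against your own metadata.
    """
    qx, qy, qz, qw, tx, ty, tz = row

    # Normalize so a slightly non-unit quaternion still gives a valid rotation.
    norm = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    qx, qy, qz, qw = qx / norm, qy / norm, qz / norm, qw / norm

    # Standard unit-quaternion to rotation-matrix formula.
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw), tx],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw), ty],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy), tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_point(matrix, point):
    """Apply the rotation and translation of a 4x4 matrix to a 3D point."""
    x, y, z = point
    return [m[0] * x + m[1] * y + m[2] * z + m[3] for m in matrix[:3]]

# Identity rotation with translation (1, 2, 3): the camera origin
# lands at the translation itself.
pose = pose_row_to_matrix([0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 3.0])
print(transform_point(pose, (0.0, 0.0, 0.0)))  # [1.0, 2.0, 3.0]
```

This covers only the extrinsic part. Going from a depth pixel to a 3D camera-space point additionally needs the intrinsic matrix K and the camera's axis conventions, which are not spelled out in this thread.
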
zehuiz2 commented 2 years ago

Thank you!