yan99033 opened this issue 6 years ago
Hi, we used the intrinsics mentioned on the website: "In the KinectFusion pipeline we used the following default intrinsics for the depth camera: Principle point (320,240), Focal length (585,585)"
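For anyone who just wants to plug those numbers in, here is a minimal sketch (Python/NumPy, my own illustration rather than anything shipped with the dataset) of assembling the pinhole intrinsics matrix K for the 640x480 depth images and back-projecting a pixel with it:

```python
import numpy as np

# Default 7-Scenes depth intrinsics quoted above (KinectFusion pipeline):
# principal point (320, 240), focal length (585, 585), for 640x480 images.
fx, fy = 585.0, 585.0
cx, cy = 320.0, 240.0

K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def backproject(u, v, depth):
    """Back-project pixel (u, v) with depth (metres) into the depth-camera frame."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.array([x, y, depth])
```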
Hi Samarth,
it seems those intrinsics are for the depth camera. I assume the intrinsics for the RGB camera are different, as the provided RGB images have a different field of view than the depth images. I also assume the provided GT poses are for the depth camera.
Besides, the RGB images are captured by a rolling-shutter camera (the Kinect), which might also be problematic for DSO.
I'm relatively new to the 7-Scenes dataset. It would be great if somebody seeing this could share some information on the RGB camera, e.g., whether the RGB camera intrinsics and the relative pose between the RGB and depth cameras are mentioned in any follow-up projects or papers.
Rui, you are right -- the FOVs of the depth and color images are slightly different. From my visual inspection, the focal length of the RGB camera is a little higher than that of the depth camera. Principal point seems to be the same. But we probably will never know without calibration.
If someone with more experience with DSO wants to eyeball the RGB camera's focal length and run it again, I'd be happy to see the new numbers. We probably can't do much about the rolling shutter camera.
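For anyone who wants to try: if I recall the DSO README correctly, it reads its geometric calibration from a plain-text camera.txt containing a pinhole line, the input image size, a rectification mode, and the output size. A sketch with a purely illustrative guess for the RGB focal length (the 600s below are not calibrated values):

```
Pinhole 600 600 320 240 0
640 480
crop
640 480
```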
Re-opening this issue in the hope that someone sees it.
The chess scene actually has lots of images of a chessboard; does anyone know the physical size of those squares? 😄
I ran the MATLAB calibration routine against a subset of the chessboard frames and, based on the following product description (it looks like a very similar board to me): https://www.amazon.com/WE-Games-Roll-up-Travel-Shoulder/dp/B000A5F0MI/ref=sr_1_12?dchild=1&keywords=roll+up+chess+set&qid=1604939540&sr=8-12, used a square size of 44.45 mm, although I understand that the intrinsics are invariant to the square size.
This resulted in the following intrinsics:
Focal length (pixels): [ 552.4015 +/- 12.2100   555.7819 +/- 11.3675 ]
Principal point (pixels): [ 306.6490 +/- 2.5843   234.8164 +/- 10.9865 ]
Radial distortion: [ 0.1134 +/- 0.0331   -1.5607 +/- 0.4997   5.7728 +/- 2.1257 ]
Tangential distortion: [ 0.0056 +/- 0.0016   -0.0094 +/- 0.0018 ]
which I believe are plausible for a Microsoft Kinect.
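For anyone who wants to reproduce this without MATLAB, here is a rough OpenCV equivalent of that calibration. The inner-corner grid size and the file glob below are my assumptions about the chess sequence, not something taken from the dataset documentation:

```python
import glob
import cv2
import numpy as np

pattern = (7, 7)         # inner corners per row/column (assumption, check the board)
square_size = 0.04445    # 44.45 mm, from the product description above
image_size = (640, 480)  # 7-Scenes RGB resolution

# 3D board points on the z = 0 plane of the board's own frame.
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_size

objpoints, imgpoints = [], []
for path in glob.glob("chess/seq-01/frame-*.color.png"):  # illustrative path
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern, None)
    if not found:
        continue
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    objpoints.append(objp)
    imgpoints.append(corners)

# Returns RMS reprojection error, camera matrix, distortion coefficients, per-view poses.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, image_size, None, None)
print("RMS reprojection error:", rms)
print("K =\n", K)
print("dist =", dist.ravel())
```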
In Figure 4 of the paper, DSO is shown alongside other methods for qualitative comparison. I am wondering how DSO could be executed in the first place, because there are no camera intrinsics for the RGB images in the 7-Scenes dataset. The bad camera trajectory could be due to wrong camera intrinsics.
If you have the RGB camera intrinsics, please let us know. AFAIK, the authors of the RGB-D 7-Scenes dataset do not share these parameters. Thanks!